AD-A192  238 




TABLE OF CONTENTS


1. The Application of Inverse Theory to Seamount Magnetism, by
Robert L. Parker, Loren Shure, and John A. Hildebrand.

2. Frequency Dependent Polarization Analysis of High-Frequency
Seismograms, by Jeffrey Park, Frank L. Vernon III, and Craig
R. Lindberg.

3. Multiple-Taper Spectral Analysis of Terrestrial Free
Oscillations: Part I, by Jeffrey Park, Craig R. Lindberg,
and David J. Thomson.

4. Multitaper Spectral Analysis of High-Frequency Seismograms,
by Jeffrey Park, Craig R. Lindberg, and Frank L. Vernon III.

5. Strict Bounds on Seismic Velocity in the Spherical Earth, by
Philip B. Stark, Robert L. Parker, G. Masters, and John A.
Orcutt.

6. Velocity Bounds from Statistical Estimates of τ(p) and X(p),
by Philip B. Stark and Robert L. Parker.



REVIEWS OF GEOPHYSICS, VOL. 25, NO. 1, PAGES 17-40, FEBRUARY 1987


The  Application  of  Inverse  Theory  to  Seamount  Magnetism 

Robert  L.  Parker 

Institute of Geophysics and Planetary Physics, Scripps Institution of Oceanography, University of California, San Diego

Loren  Shure1 

Woods Hole Oceanographic Institution, Woods Hole, Massachusetts

John  A.  Hildebrand 

Marine  Physical  Laboratory,  Scripps  Institution  of  Oceanography,  University  of  California,  San  Diego 

The traditional least squares method for modeling seamount magnetism is often unsatisfactory
because the models fail to reproduce the observations accurately. We describe an alternative
approach permitting a more complex internal structure, guaranteed to generate an external field in
close agreement with the observed anomaly. Potential field inverse problems like this one are
fundamentally incapable of a unique solution, and some criterion is mandatory for picking a plausible
representative from the infinite-dimensional space of models all satisfying the data. Most of the
candidates are unacceptable geologically because they contain huge magnetic intensities or rapid
variations of magnetization on fine scales. To avoid such undesirable attributes, we construct the
simplest type of model: the one closest to a uniform solution as measured by the norm in a
specially chosen Hilbert space of magnetization functions, found by a procedure called seminorm
minimization. Because our solution is the most nearly uniform one, we can say with certainty that
any other magnetization satisfying the data must be at least as complex as ours. The theory
accounts for the complicated shape of seamounts, representing the body by a covering of triangular
facets. We show that the special choice of Hilbert space allows the necessary volume integrals to be
reduced to surface integrals over the seamount surface, and we present numerical techniques for
their evaluation. Exact agreement with the magnetic data cannot be expected because of the error
of approximating the shape and because the measured fields contain noise of crustal, ionospheric,
and magnetospheric origin. We examine the potential size of the various error terms and find that
those caused by approximation of the shape are generally much smaller than the rest. The mean
magnetization is a vector that can in principle be discovered from exact knowledge of the external
field of the seamount; this vector is of primary importance for paleomagnetic work. We study the
question of how large the uncertainty in the mean vector may be, based on actual noisy, as
opposed to exact, data; the uncertainty can be limited only by further assumptions about the
internal magnetization. We choose to bound the rms intensity. In an application to a young seamount
in the Louisville Ridge chain we find that remarkably little nonuniformity is required to obtain
excellent agreement with the observed anomaly, while the uniform magnetization gives a poor fit.
The paleopole position of the ordinary least squares solution lies over 30° away from the geographic
north, but the pole derived from our seminorm minimizing model is very near the north pole, as it
should be. A calculation of the sensitivity of the mean magnetization vector to the location of the
magnetic observations shows that the data on the perimeter of the survey were given the greatest
weight and suggests that enlargement of the survey area might further improve the reliability of the
results.

1. Introduction

Ever since C. Darwin deduced the general subsidence of the seafloor from his
observations of coral atolls, seamounts have been valuable sources of information
about marine geology. The existence of chains of seamounts is the best evidence for
deep hot spots stationary in the mantle [Morgan, 1971], and the regular lateral
spacing of these rows may indicate the presence of longitudinal convection cells
[Richter and Parsons, 1975]. The statistics of seamount distribution has been used
to provide information on the variability of tectonic activity throughout the recent
geologic past [Batiza, 1982]. The weight of the seamount is a load that deforms the
ocean crust, and analysis of the bathymetry of the seafloor around a seamount yields
estimates of the strength of the oceanic lithosphere [McNutt and Menard, 1978] that
contributes to our understanding of the thermal evolution of lithospheric plates
[Watts et al., 1980].

The first quantitative geophysical studies of seamounts concerned their magnetism.
Vacquier [1962] developed a method for calculating an average magnetization vector
using observations of the magnetic field anomaly and the bathymetric contour of the
seamount. This method approximated the seamount body with rectangular prisms and
assumed that all the prisms were of uniform magnetization in order to calculate the
average magnetization with the minimum least squares error in the fit of the anomaly.
Talwani [1965] modified the least squares method by representing the body as a
collection of horizontal polygonal laminas whose outlines followed the contours of
the body, and Plouff [1976] refined this method by increasing the accuracy of the
integration in the vertical direction. The results from the least squares modeling
program were used for paleomagnetic study of seamounts [Uyeda and Richards, 1966;
Richards et al., 1967; Vacquier and Uyeda, 1967], and as soon as the ideas of
seafloor spreading became accepted, such analyses were applied to the unraveling of
the history of the ocean basins [Francheteau et al., 1970; Harrison et al., 1975;
Gordon and Cox, 1980; Sager, 1983]. Despite widespread use and some suggestive
findings, the results of this program have been rather disappointing. The scatter of
paleopoles derived from apparently homogeneous groups of seamounts is often large,
and the portion of the magnetic anomaly accounted for by the model is often small.

We believe the generally unsatisfactory performance is due in large measure to
limitations of the least squares method of analysis of the magnetic data. The
interior of the seamount is assumed to be uniformly magnetized down to its base, the
plane of the surrounding seafloor, and below this level the magnetization is the same
as in the surrounding oceanic crust. In its barest form, the least squares model
contains just three unknown parameters: the three components of the magnetization
vector. It is generally necessary to include as unknowns the parameters of a local
background field varying linearly across the survey region to correct for small
errors in the computation of the anomaly from total field measurements; this
increases the number of unknowns to six. Linearity of the relation between
magnetization and the observed magnetic anomalies permits parameter estimation by a
least squares solution of the overdetermined equations connecting the model to the
data. The underlying idea is that any discrepancy between the predictions of the
model and the observed anomalies originates from uncorrected diurnal variations,
crustal magnetic fields, and so on that may be treated as random, independent noise
sources. Under these circumstances the Gauss-Markov theorem justifies the application
of the least squares formulation by its assertion that the true uniform magnetization
will be recovered if enough data are used. Yet when the predicted field is subtracted
from the observed one, the residual never has the form of a random, uncorrelated
noise signal as it must if the conditions of the theorem are to be satisfied. There
are systematic, large-scale residual fields concentrated around the seamount that
remain unaccounted for by the uniform model. This may happen even when the seamount
exhibits an anomaly of the simplest form, with just one maximum, or when it is
situated on crust so young that no reversal can have occurred in the history of the
body. It has long been recognized that the pattern in the anomaly residual is
indisputable evidence for a more complex internal magnetization.
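
The six-parameter least squares fit described above can be sketched in a few lines. This is an illustration only and not the program used in the studies cited: the seamount is replaced by a single point dipole (a hypothetical stand-in for the sum over prisms or laminas), and the names `dipole_kernel` and `fit_uniform`, the survey geometry, and all numerical values are assumptions made for the example.

```python
import numpy as np


def dipole_kernel(obs, src, b_hat):
    """Rows g_j with g_j . m = total-field anomaly at obs[j] from a dipole at src.

    Lengths are in km and the moment m is expressed in nT km^3 (it already
    includes the factor mu_0 / 4 pi), so the anomaly comes out in nT.
    """
    r = obs - src                                  # source-to-observer vectors
    dist = np.linalg.norm(r, axis=1, keepdims=True)
    r_hat = r / dist
    proj = r_hat @ b_hat                           # r_hat . b_hat for each point
    return (3.0 * proj[:, None] * r_hat - b_hat) / dist**3


def fit_uniform(obs, anomaly, src, b_hat):
    """Least squares fit of 3 moment components plus a planar background."""
    g = dipole_kernel(obs, src, b_hat)
    plane = np.column_stack([np.ones(len(obs)), obs[:, 0], obs[:, 1]])
    design = np.hstack([g, plane])                 # N x 6, overdetermined
    params, *_ = np.linalg.lstsq(design, anomaly, rcond=None)
    return params[:3], params[3:], anomaly - design @ params


# Synthetic survey (assumed): 15 x 15 grid at the sea surface, source 3 km below.
x, y = np.meshgrid(np.linspace(-10, 10, 15), np.linspace(-10, 10, 15))
obs = np.column_stack([x.ravel(), y.ravel(), np.zeros(x.size)])
src = np.array([0.0, 0.0, -3.0])
b_hat = np.array([0.5, 0.0, 0.75 ** 0.5])          # unit vector along ambient field
moment_true = np.array([2000.0, -500.0, 6000.0])   # nT km^3, made-up values
plane_true = np.array([12.0, 0.8, -0.3])           # nT, nT/km, nT/km, made up

anomaly = (dipole_kernel(obs, src, b_hat) @ moment_true
           + plane_true[0] + plane_true[1] * obs[:, 0] + plane_true[2] * obs[:, 1])
moment_est, plane_est, residual = fit_uniform(obs, anomaly, src, b_hat)
```

With noise-free data generated by the same uniform source, the six parameters are recovered and the residual vanishes. The point made in the text is that real residuals do not look like this: mapping `residual` back onto the survey grid is the simplest way to see whether it is spatially organized rather than random.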

Several investigators have inferred variations in the strength of internal
magnetization using the pattern of magnetic field anomaly residuals. Richards et al.
[1967] and Harrison [1971] observed that short-wavelength residuals centered over the
seamount top could be an indication of nonmagnetic rocks capping the seamount. They
modeled this by eliminating the uppermost bathymetric layers from the model and were
able to increase the correspondence between the model predictions and the
observations. Emilia and Massey [1974] confirmed this result by allowing their
inversion program to vary the magnetization amplitude of the model for each layer in
the seamount, although they found their method was unstable if too many independent
layers were used. Schimke and Bufe [1968] obtained a magnetization for the cap of
Chautauqua Seamount by inverting the residual anomaly calculated for the whole
seamount. The sum of the cap and the whole seamount magnetization indicated that the
cap may be more weakly magnetized than the remainder of the seamount body
[Francheteau et al., 1968]. Blakely and Christiansen [1978] used the pattern of
magnetic residuals at the Mount Shasta Volcano to delineate lateral variations in
internal magnetization. They observed that the western portion of Mount Shasta may
have greater magnetization than its eastern portion and concluded that nonuniform
magnetization could lead to erroneous paleomagnetic poles using the least squares
method. Likewise, Kodama and Uyeda [1979] used magnetic field inversion to deduce
that the eastern portion of Oshima Volcano may have lower magnetization than the rest
of the body. To explain this pattern, they presented geological evidence for an older
volcanic edifice hidden beneath the eastern part of the volcano.

Other workers have attempted to account for seamount magnetic anomaly residuals by
assuming that portions of the seamount contain both normal and reversely magnetized
rocks. Sager et al. [1982] assumed the top kilometer of Nagata Seamount was of
reversed polarity, opposite in direction and equal in magnitude to the remainder of
the body. They divided the body into normal and reversed sections by introducing a
negative volume for the assumed reversed portion of the body, resulting in an
improved goodness of fit between the calculated and the observed anomalies. Likewise,
for a collection of seamounts on the Cocos plate, McNutt [1986] used a modified least
squares method allowing for solution of up to nine distinct regions of magnetization.
The number and the location of the magnetically distinct regions were specified
before inversion, and in two cases the seamounts appeared to have regions of both
normal and reversed polarity. Naturally, including more degrees of freedom in the
models improves the fit, but the significance of any conclusions obtained is
questionable in view of the arbitrariness involved in the subdivision procedure.

Another approach to removing the effect of nonuniform magnetization is to smooth the
magnetic field anomaly before attempting inversion for the magnetization. This
approach has been used for complicated magnetic anomalies where short-wavelength
components may be imposed on a longer-wavelength anomaly. The justifying assumption
for smoothing is that the volume of rocks creating the short-wavelength anomalies is
small in comparison to the volume creating the long-wavelength anomaly. Miles and
Roberts [1981] used an orthogonal profile technique to smooth the magnetic anomaly of
Rosemary Bank Seamount before inverting for the least squares magnetization. Sager
[1984] used an upward continuation algorithm to smooth the magnetic field anomaly of
Abbott Seamount. He argued that the advantage of upward continuation is that it is
physically equivalent to increasing the height of the plane of the magnetic
observations, and therefore it gives a valid estimate of the magnetization amplitude.
Gardner et al. [1984] used the same procedure to continue upward the magnetic anomaly
of Shimada Seamount by 4 km before attempting inversion with the least squares
method. A set of field data is not associated with a unique upward continuation, and
so numerical upward continuation involves its own (often unstated) assumptions about
the true magnetic field. The differences between the various upward continued
versions of the data correspond exactly to the differences in the ascribed
magnetizations of the various models; the inherent ambiguity cannot be avoided by
this kind of preprocessing.
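
For readers unfamiliar with the operation, upward continuation of gridded data is commonly carried out in the wavenumber domain, where a component with horizontal wavenumber k is attenuated by exp(-k h) when the observation level is raised by h. The sketch below is a generic version of that idea and is not the algorithm of any of the papers cited. It assumes the anomaly is sampled on a regular grid on a horizontal plane and repeats periodically outside the survey; the function name `upward_continue` and the grid values are hypothetical. The periodicity and flat-plane assumptions are examples of the unstated assumptions about the unmeasured field that the paragraph above refers to.

```python
import numpy as np


def upward_continue(field, dx, dy, height):
    """Continue a gridded potential-field anomaly upward by `height`.

    field  : 2-D array indexed [iy, ix], sampled on a level plane
    dx, dy : grid spacings, in the same length unit as height
    Assumes the field is periodic over the grid (an FFT side effect).
    """
    ny, nx = field.shape
    kx = 2.0 * np.pi * np.fft.fftfreq(nx, d=dx)    # angular wavenumbers in x
    ky = 2.0 * np.pi * np.fft.fftfreq(ny, d=dy)    # angular wavenumbers in y
    kxx, kyy = np.meshgrid(kx, ky)                 # both of shape (ny, nx)
    k = np.hypot(kxx, kyy)
    spectrum = np.fft.fft2(field)
    return np.real(np.fft.ifft2(spectrum * np.exp(-k * height)))


# Single harmonic of wavelength 8 km on a 32 km x 32 km grid (made-up numbers).
dx = dy = 0.5                                      # km
xs = np.arange(64) * dx
field = np.tile(100.0 * np.cos(2.0 * np.pi * xs / 8.0), (64, 1))   # nT
continued = upward_continue(field, dx, dy, height=4.0)
```

For this single harmonic the result is the input scaled by exp(-2π · 4/8) = exp(-π), so short wavelengths are suppressed far more strongly than long ones, which is why the operation acts as a smoother.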

It is clear that the presence of mixed polarity in a seamount may be a source of
error in the calculation of its paleomagnetic pole. To demonstrate this, Lumb et al.
[1973] constructed the synthetic magnetic anomaly for a seamount model with mixed
polarity and showed that an inaccurate estimate of the paleopole was obtained if a
homogeneous magnetization was assumed. They consequently used mixed polarity
magnetization to explain the scatter of paleopoles obtained from inversion of the
magnetic anomalies of the Cook Islands. In contrast, Sager [1983] argued that
seamounts of mixed polarity may yield accurate paleopoles if one polarity clearly
dominates the body and the effect of the other polarity is removed by upward
continuation. This is probably true for Abbott Seamount [Sager, 1984] because of the
agreement between its magnetic paleolatitude and the latitude of the Hawaiian hot
spot. However, there is a significant discrepancy between the paleopole calculated
for Shimada Seamount [Gardner et al., 1984] and the pole position near the north pole
implied by its young age. There are several cases of disagreement between the
magnetization inferred from least squares magnetic field inversion and the
magnetization measured from rock samples. At Cobb Seamount, Merrill and Burns [1972]
reported difficulty in reconciling the paleopole obtained from magnetic field
inversion and that measured from summit rock samples. Similarly, the magnetic field
inversion for Suiko Seamount [Kodama et al., 1978] yields a paleolatitude
significantly different from that measured for rocks obtained from Suiko during Deep
Sea Drilling Project leg 55 [Kano, 1977]. Similarly, the magnetic field inversion for
the Oshima Volcano [Kodama and Uyeda, 1979] provides a magnetization direction that
differs from paleomagnetic measurements taken on surface volcanic rocks.

These inconsistencies point to the need for a more general magnetization model. The
fundamental difficulty facing anyone who wishes to introduce a more complex structure
is the nonuniqueness of the inverse problem. Even when a magnetic field caused by
internal magnetization is known exactly at every point outside the seamount, there
are infinitely many other magnetizations generating precisely the same exterior
field. To get some idea of how large a family of models is compatible with every
exterior field, consider $f$, an arbitrary continuously differentiable function
vanishing on $\partial V$, the boundary of the seamount. If a magnetization vector is
defined by $\mathbf{M} = \nabla f$, it is easily shown by application of Gauss'
theorem that the exterior field associated with $\mathbf{M}$ is identically zero.
Thus if $\mathbf{M}_0$ generates a particular exterior field,
$\mathbf{M} + \mathbf{M}_0$ will cause an identical one for every $f$ of the
specified form. Therefore, on the basis of the magnetic field data alone, it is
impossible to distinguish between an enormous variety of different models. To
overcome this basic ambiguity, some restriction must be introduced from our knowledge
of geology and geophysics to limit the amount of variability. This is just what the
uniform magnetization assumption does in a heavy-handed way.
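
The step attributed to Gauss' theorem can be written out explicitly. The following is a sketch in notation assumed for illustration rather than taken from the paper: $\Phi$ is the scalar magnetic potential at an exterior point $\mathbf{r}$ (so that $\mathbf{B} = -\nabla\Phi$ outside the body), $R = |\mathbf{r} - \mathbf{s}|$, and $\hat{\mathbf{n}}$ is the outward normal on $\partial V$.

```latex
\begin{align*}
\Phi(\mathbf{r})
  &= \frac{\mu_0}{4\pi}\int_V \mathbf{M}(\mathbf{s})\cdot
     \nabla_{\mathbf{s}}\frac{1}{R}\,d^3\mathbf{s}
   = \frac{\mu_0}{4\pi}\int_V \nabla f\cdot
     \nabla_{\mathbf{s}}\frac{1}{R}\,d^3\mathbf{s} \\
  &= \frac{\mu_0}{4\pi}\left[\oint_{\partial V} f\,\hat{\mathbf{n}}\cdot
     \nabla_{\mathbf{s}}\frac{1}{R}\,dS
     \;-\; \int_V f\,\nabla_{\mathbf{s}}^{2}\frac{1}{R}\,d^3\mathbf{s}\right]
   = 0 .
\end{align*}
```

The second line follows from the identity $\nabla\cdot(f\,\nabla g) = \nabla f\cdot\nabla g + f\,\nabla^2 g$ together with Gauss' theorem. The surface term vanishes because $f = 0$ on $\partial V$, and the volume term vanishes because $1/R$ is harmonic throughout $V$ whenever $\mathbf{r}$ lies outside the body. With $\Phi$ identically zero outside, the exterior field is zero as well.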

Because the magnetic anomaly data are consistent with magnetizations of infinite
complexity, we must try to avoid being misled into believing some accidental feature
of a model is truly demanded in the solution. Our approach in this paper is to
construct the magnetization model matching the data that is as close as possible to a
uniform model. The details of what we mean by "closeness" and the techniques for
achieving the desired objective will be discussed in the next section. Having found
the model with the minimum amount of nonuniformity, we know the true internal
magnetization must possess that degree of nonuniformity, or more. This is useful
information because it tells us something about the complexity of the body.

Nonuniform seamount magnetization may be produced by several factors: (1) the
duration of seamount volcanism, (2) the variety of seamount rock types with different
magnetization characteristics, and (3) the structural complexity of seamounts. The
duration of seamount volcanism is not well known, but estimates range from as short
as a few hundred thousand years [Duncan and McDougall, 1976] to as long as 10 or 20
million years [Menard, 1964; McDougall and Schmincke, 1976]. These time spans are
long in comparison to the time for secular variation of the geomagnetic field,
implying that the magnetization of individual seamount lava flows may be deflected by
several degrees but that the average magnetization will represent an axial dipole
field. The duration of seamount volcanism is short in comparison to the frequency of
geomagnetic field reversals during the Cretaceous [Kent and Gradstein, 1985] but is
long in comparison to the frequency of reversals during the last 5 million years
[Lowrie and Kent, 1983]. The probability of spanning a field reversal during
construction is therefore higher for Tertiary seamounts than for Cretaceous
seamounts, and this allows Cretaceous seamounts to be more easily modeled. Seamounts
with episodic or post-erosional volcanism may be constructed of rocks with imprints
from geomagnetic fields of different periods and locations. For example, Rice et al.
[1980] reported that as much as 32% of Bermuda is made of mid-Tertiary sills that
were intruded into a Cretaceous edifice. Additionally, for the southern Line Islands
it was reported that both Eocene and Late Cretaceous volcanism are present within the
seamount edifices [Haggerty et al., 1982]. Seamount nonuniform magnetization may also
result from the variety of rock types involved in their construction. Seamount rocks
such as hyaloclastites, pillow lavas, dikes, and gabbros may differ significantly in
their magnetic properties. Hyaloclastites are relatively nonmagnetic rocks composed
of ash, sand, and broken pillow rinds associated with explosive underwater volcanism
[McBirney, 1971; Lonsdale and Batiza, 1980]. Harrison [1971] proposed that the weakly
magnetized top of a seamount may be composed of hyaloclastites. Likewise, Harrison
and Ball [1975] observed low magnetization at an exposed seamount composed
predominantly of hyaloclastite tuff. Higher magnetization will be present in seamount
basalts, such as pillow lavas and dikes, as well as in seamount gabbros. It is not
known what fractions of the seamount bodies are composed of hyaloclastites relative
to basalts or gabbros. However, it has been put forward that large amounts of these
materials may be found on the flanks [Lonsdale and Batiza, 1980] and summits [Batiza
et al., 1984] of seamounts. Structural complexity may be another factor leading to
nonuniform seamount magnetization. Seamounts may contain large magma chambers or
conduits, and these bodies may require a few hundred thousand years to cool because
of the low thermal conductivity typical of basalt [Grossling, 1970]. Such a body may
partially cool in a polarity interval opposite to the rest of the seamount or may
record magnetization changes due to geomagnetic secular variation or grain size
variation. It has been proposed that tilting of the flanks of seamounts may occur as
a result of inflation of magma chambers [Staudigel and Schmincke, 1984], resulting in
changes in magnetization inclination. Likewise, dike and rift zones may have distinct
magnetic signatures, and they are particularly prominent features on large seamounts
and guyots.

The existence of a large family of alternative models each capable of exactly
matching the magnetic anomaly certainly means that it is impossible to obtain an
exact description of the interior magnetization from these data. Furthermore, the
data will usually not allow us to decide unambiguously between competing geological
speculations. Therefore it is important to identify any features of a model that can
in principle be strictly related to the magnetic anomaly. The dipole moment of the
seamount can be computed from its exterior field if this is exactly known. The dipole
moment is especially useful geophysically because, after division by the volume of
the seamount, it is the vector of average magnetization. This is the vector most
diagnostic of the mean geomagnetic field direction during the formation of the
seamount, and so it is a most important quantity for paleomagnetic research. As it
happens, when we compute the most nearly uniform magnetization, the mean
magnetization is automatically separated from the nonuniform remainder; it is then
easy to find the paleomagnetic pole associated with the most uniform seamount. The
actual dipole moment and our estimate will differ because the magnetic data are
incomplete and imprecise. A key question is, How far can the true mean magnetization
differ from the vector associated with our model? We develop a theory to provide the
answer. We require an upper limit on the intensity of magnetization of the rocks of
the seamount; otherwise, the difference between the actual mean magnetization and the
uniform vector may be arbitrarily large. Preliminary calculations with the theory
indicate that more work needs to be done on this topic because the estimated
uncertainties remain disappointingly large. The focus of our current theoretical
research is the refinement of the bounds on the uncertainty.
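
Converting a mean magnetization vector into a paleomagnetic pole uses the standard geocentric axial dipole relation tan I = 2 tan λ, where I is the inclination and λ the magnetic latitude. The sketch below is a generic textbook calculation, not code from this study. The function name `paleopole`, the (north, east, down) component convention, and the site coordinates in the usage lines are assumptions made for the example.

```python
import math


def paleopole(m_north, m_east, m_down, site_lat, site_lon):
    """Pole (lat, lon in degrees) implied by a mean magnetization vector.

    Components are in a local north, east, down frame; only the direction
    of the vector matters. Assumes a geocentric axial dipole field.
    """
    horiz = math.hypot(m_north, m_east)
    dec = math.atan2(m_east, m_north)             # declination
    # Magnetic colatitude p from tan(I) = 2 cot(p), written without tan(I).
    p = math.atan2(2.0 * horiz, m_down)
    lat_s = math.radians(site_lat)

    sin_lat_p = (math.sin(lat_s) * math.cos(p)
                 + math.cos(lat_s) * math.sin(p) * math.cos(dec))
    lat_p = math.asin(max(-1.0, min(1.0, sin_lat_p)))

    cos_lat_p = math.cos(lat_p)
    if cos_lat_p < 1e-12:                         # pole on the rotation axis:
        return math.degrees(lat_p), site_lon % 360.0   # longitude is arbitrary
    sin_beta = math.sin(p) * math.sin(dec) / cos_lat_p
    beta = math.degrees(math.asin(max(-1.0, min(1.0, sin_beta))))
    if math.cos(p) >= math.sin(lat_s) * math.sin(lat_p):
        lon_p = site_lon + beta
    else:
        lon_p = site_lon + 180.0 - beta
    return math.degrees(lat_p), lon_p % 360.0


# Usage with a hypothetical southern hemisphere site: a magnetization acquired
# in an axial dipole field points along I = atan(2 tan(latitude)), D = 0.
site_lat, site_lon = -50.0, 220.0
inc = math.atan(2.0 * math.tan(math.radians(site_lat)))
pole_lat, pole_lon = paleopole(math.cos(inc), 0.0, math.sin(inc),
                               site_lat, site_lon)
```

For a young seamount the pole computed this way should fall close to the geographic pole, which is the consistency check used later in the paper. In the usage lines the input direction is exactly that of an axial dipole field, so `pole_lat` comes out at 90° to within rounding.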

The plan of this paper is as follows. Section 2 gives the mathematical details
concerning the construction of a most nearly uniform magnetization. For this problem
we have chosen a Hilbert space setting in which the norm of the space is proportional
to the rms magnetization. In this space the distance between two models is the norm
of their difference. We decompose an arbitrary magnetization into two parts: a
uniform magnetization (a vector of constant magnitude and direction at every interior
point of the seamount) and another, nonuniform part that may vary in magnitude and
direction. The model we seek is the one that has the smallest nonuniform component
and satisfies the measurements of the magnetic field anomaly. The norm of the
nonuniform portion is a seminorm of the magnetization in the language of functional
analysis, and so we call the modeling process seminorm minimization in contrast with
many geophysical inversion techniques which are model norm minimizations. Although it
is always possible in principle to obtain exact agreement between the predictions of
the model and the measurements, we should allow for misfit because of noise in the
measurements and approximations in the theory. Section 3 of the paper deals with the
various approximations necessitated by practical calculation and measurement. The
shape of the seamount cannot be represented exactly in any actual computation, so we
have chosen an approximation for it in terms of an enclosing set of triangular facets
on a flat base. We estimate the magnitude of the errors introduced by this
approximation and show how they may be kept well below the uncertainties associated
with the magnetic observations. To carry out the theory of section 2, a large number
of volume integrals must be evaluated over the seamount. Even with our simplified
body those integrals cannot be performed in closed form, and therefore we adopt a
scheme for numerical approximation. Here one of the advantages of our particular
Hilbert space formulation becomes evident: the volume integrals can be transformed
into surface integrals by means of Gauss' theorem. Despite this, the numerical work
in obtaining the necessary accuracy is great; we describe efficient numerical
processes for computing the surface integrals. Section 4 treats the question of
estimating the uncertainty in the uniform part of the magnetization model. We show
how a knowledge of the maximum permissible intensity of magnetization can be
converted into a bound on the uncertainty in the average magnetization. In section 5
the theory is applied to a seamount in the South Pacific Ocean on the Louisville
Ridge seamount chain. Ordinary least squares modeling of this seamount is
unsatisfactory in two ways. First, the predicted anomaly has the wrong shape and
magnitude, resulting in an rms misfit of 269 nT to an anomaly with rms magnitude of
less than 600 nT. Second, the calculated paleomagnetic pole position is more than 30°
from the north geographic pole, a displacement most improbable for a young seamount,
as this one is by the evidence of radiometric dating and its position in the
Louisville chain. Application of our method overcomes both deficiencies: from
magnetic field measurements on the approach to the seamount we estimate that the
local crustal fields contribute about 30 nT to an anomaly with a peak magnitude of
over 1200 nT; we find the most nearly uniform model with this rms misfit. Its pole
position is within 7° of the geographic north pole. The uncertainty ascribed to the
pole position by the error theory appears to be much too large, and reasons for this
discrepancy are put forward.
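
A finite-dimensional analogue may help fix ideas before the Hilbert space treatment of section 2. Suppose, purely for illustration, that the seamount is divided into n cells of equal volume, so that the rms magnetization is proportional to the Euclidean norm of the stacked cell vectors, and that the data are related to the model by a matrix G (filled here with random numbers as a stand-in for the true Green's functions). Minimizing the norm of the nonuniform part while fitting the data exactly then leads to a bordered linear system. The names `seminorm_min`, `G`, and `U` are hypothetical, and this is not the authors' algorithm, which works with continuous functions and allows for misfit.

```python
import numpy as np


def seminorm_min(G, d, n_cells):
    """Most nearly uniform model m = U @ u + r with G @ m = d.

    Minimizes |r|^2 over r and the uniform vector u. Stationarity of the
    Lagrangian gives r = G.T @ alpha with (G @ U).T @ alpha = 0, which makes
    r orthogonal to every uniform model (its cell average is zero).
    """
    U = np.tile(np.eye(3), (n_cells, 1))          # (3n, 3): same vector per cell
    A = G @ U                                     # data from unit uniform models
    gram = G @ G.T                                # Gram matrix of the data rows
    n_data = len(d)
    K = np.block([[gram, A], [A.T, np.zeros((3, 3))]])
    sol = np.linalg.solve(K, np.concatenate([d, np.zeros(3)]))
    alpha, u = sol[:n_data], sol[n_data:]
    return u, G.T @ alpha, U


rng = np.random.default_rng(0)
n_cells, n_data = 40, 25
G = rng.standard_normal((n_data, 3 * n_cells))    # stand-in kernels (assumption)

# A "true" model: uniform vector plus a zero-mean nonuniform perturbation.
u_true = np.array([1.0, -2.0, 4.0])
pert = 0.3 * rng.standard_normal((n_cells, 3))
pert -= pert.mean(axis=0)
m_true = (u_true + pert).ravel()
d = G @ m_true

u_est, r_est, U = seminorm_min(G, d, n_cells)
m_est = U @ u_est + r_est
```

Because `m_true` itself fits the data, the norm of `r_est` can be no larger than the norm of the true nonuniform part `pert`. That is the discrete counterpart of the statement that the true magnetization must be at least as nonuniform as the seminorm-minimizing model.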

Most readers will not require a deep understanding of the mathematical derivations or
of the numerical techniques that lie behind a successful application of the theory.
To them we suggest the following strategy: first, go quickly through the next section
to get an idea of the theoretical framework; then skip to section 5 where an actual
magnetic survey is analyzed with the method. In the application we have included
signposts to the earlier material in the event that the reader wishes to follow up
any particular point in greater detail.


2. Theory I: Finding a Model

This  section  explains  how  to  find  the  most  nearly  uni¬ 
form  magnetization  within  a  seamount  consistent  with 
magnetic  field  data  measured  in  its  vicinity.  The  funda¬ 
mental  geological  assumptions  are  that  the  seamount  was 
formed  by  the  outpouring  of  lavas  onto  a  previously  exist¬ 
ing,  relatively  level  crust  and  that  the  new  material  did  not 
cause the older crust below to become strongly remagnetized.
Our model seamount does not have large magnetic
"roots"; instead, the significant magnetic sources lie above
the level of the surrounding seafloor. Naturally, the magnetic
material around the seamount and under it must
contribute to the measured fields at the sea surface.
These signals are noise as far as we are concerned, and we
allow for them by permitting mismatch between the predictions
of the model and the observations.

The  first  task  is  to  solve  the  forward  problem,  that  is, 
the  calculation  of  the  predicted  magnetic  fields  from  a 
known model of magnetization. Almost all marine magnetic
measurements are of the total field intensity $|\mathbf{B}|$,
which we shall assume have been reduced to total field
anomalies by subtraction of a local, total field computed
from a global field model. As we have already noted in
the  introduction,  it  may  be  necessary  to  include  in  our 
model  parameters  describing  the  variation  of  the  global 
field  over  the  survey  area.  Because  the  contribution  of  the 
seamount  to  the  total  observed  field  is  small,  the  resultant 
anomaly  is  well  approximated  by 


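$$ \Delta|\mathbf{B}| = \hat{\mathbf{B}}_0 \cdot \Delta\mathbf{B} $$

where $\hat{\mathbf{B}}_0$ is a unit vector in the direction of the ambient
field at the site, and $\Delta\mathbf{B}$ is the field vector associated with
local sources. Suppose for the moment that the entire
anomaly is caused by the magnetization of the seamount;
then the anomaly found at $\mathbf{r}$, the position of an observer,
is just

$$ \Delta|\mathbf{B}(\mathbf{r})| = \int_V \mathbf{G}(\mathbf{r}, \mathbf{s}) \cdot \mathbf{M}(\mathbf{s}) \, d^3\mathbf{s} \qquad (1) $$

where $\mathbf{M}(\mathbf{s})$ is the magnetization vector at a point $\mathbf{s} \in V$
in the body and $\mathbf{G}(\mathbf{r}, \mathbf{s})$ is Green's function for the
problem, namely,

$$ \mathbf{G}(\mathbf{r}, \mathbf{s}) = -\frac{\mu_0}{4\pi} \, \hat{\mathbf{B}}_0 \cdot \nabla_s \nabla_r \frac{1}{|\mathbf{r} - \mathbf{s}|} = \frac{\mu_0}{4\pi} \left[ \frac{3 (\mathbf{r} - \mathbf{s}) \, \hat{\mathbf{B}}_0 \cdot (\mathbf{r} - \mathbf{s})}{|\mathbf{r} - \mathbf{s}|^5} - \frac{\hat{\mathbf{B}}_0}{|\mathbf{r} - \mathbf{s}|^3} \right] \qquad (2) $$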


This function gives the field component at $\mathbf{r}$ in the direction
of $\hat{\mathbf{B}}_0$ owing to an elementary dipole at $\mathbf{s}$. Let us
recognize explicitly in the notation the important fact that
measurements are obtained at only a finite number of
places $\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N$. We simplify (1) as follows:

$$ d_j = \int_V \mathbf{G}_j(\mathbf{s}) \cdot \mathbf{M}(\mathbf{s}) \, d^3\mathbf{s}, \qquad j = 1, 2, \ldots, N \qquad (3) $$

Here $\mathbf{G}_j(\mathbf{s})$ stands for $\mathbf{G}(\mathbf{r}_j, \mathbf{s})$, and $d_j$ is an abbreviation
for the $j$th datum, $\Delta|\mathbf{B}(\mathbf{r}_j)|$. Equations (2) and (3) constitute
a complete formal solution to the forward problem;
a practical solution requires in addition efficient numerical
procedures for the evaluation of the volume integrals over
the complicated shape of the seamount. This question is
deferred until section 3, because we shall need to evaluate
other  more  involved  functions  over  the  same  domain  to 
solve  the  inverse  problem. 
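As a concrete illustration of (1) to (3), the sketch below evaluates the dipole form of Green's function (2) and approximates the volume integral (3) by a midpoint sum over small cells. The cell decomposition, the names `green_function` and `predict_anomaly`, and the sample numbers are assumptions made for illustration only; the integration scheme actually used is the subject of section 3.

```python
import numpy as np

MU0 = 4e-7 * np.pi  # permeability of free space, T m / A


def green_function(r, s, b0_hat):
    """Dipole form of (2): anomaly at r per unit dipole moment at s."""
    b0 = np.asarray(b0_hat, dtype=float)
    x = np.asarray(r, dtype=float) - np.asarray(s, dtype=float)
    dist = np.linalg.norm(x)
    return MU0 / (4.0 * np.pi) * (3.0 * x * (b0 @ x) / dist**5 - b0 / dist**3)


def predict_anomaly(observers, cell_centres, cell_volumes, cell_mags, b0_hat):
    """Midpoint-rule stand-in for (3): sum of G_j . M times cell volume."""
    return np.array([
        sum(green_function(r, s, b0_hat) @ np.asarray(m, dtype=float) * v
            for s, m, v in zip(cell_centres, cell_mags, cell_volumes))
        for r in observers
    ])


# Hypothetical example: one vertically magnetized cell (5 A/m, 1e6 m^3)
# located 1000 m below the first observer, in a vertical ambient field.
b0 = [0.0, 0.0, 1.0]
observers = [[0.0, 0.0, 0.0], [1000.0, 0.0, -1000.0]]
anomaly = predict_anomaly(observers, [[0.0, 0.0, -1000.0]], [1.0e6],
                          [[0.0, 0.0, 5.0]], b0)
```

With magnetization in A/m and lengths in metres the result is in tesla; the first observer, directly above the cell, sees twice the magnitude and the opposite sign of the second, which sits at the same distance in the equatorial plane of the dipole.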

At  this  point  we  introduce  the  notion  of  distance 
between  two  models  of  magnetization,  so  that  there  is  a 
definite  meaning  to  the  idea  that  some  models  are  closer 
together  than  others.  A  natural  mathematical  setting  for 
this discussion is a normed linear vector space $X$ containing
as elements all the magnetization functions that might
occur inside $V$; any magnetization $\mathbf{M}$ is a single element in
$X$. The distance between any two elements $\mathbf{M}$ and $\mathbf{N}$ of $X$
is the norm of their difference $\|\mathbf{M} - \mathbf{N}\|$. Equation (3) is
interpreted  as  saying  that  each  observation  is  given  by  a 
linear  functional  of  M.  There  are  several  normed  vector 
spaces  that  might  be  suitable  in  this  context.  In  the  study 
of  marine  magnetic  anomalies,  it  is  traditional  to  reduce 
the  vector-valued  magnetization  to  a  scalar  function  of 
position  times  a  constant  unit  vector,  in  other  words,  to 
consider  only  magnetizations  with  constant  direction.  One 
might  at  first  suppose  that  the  restriction  to  unidirectional 
magnetization  models  might  make  it  impossible  to  fit  the 
data  properly,  particularly  if  the  "wrong"  direction  were 
chosen,  but  it  can  be  proved  that  such  models  are  capable 
of satisfying any finite data set, no matter what direction is
used. The proof follows from the linear independence of
the associated representers, something established by the
methods used in appendix A. Nonetheless, we believe it
is important not to make restrictive assumptions about the
magnetization of the seamount, and so we employ a space
that allows complete freedom for the magnetization functions
that are its elements.

Parker [1971] proposed the use in this problem of a Hilbert
space, which we shall call $P$; here elements are
vector-valued functions of position $\mathbf{s} \in V$, for example,
magnetizations. Technically, an element of $P$ is a certain
equivalence class of functions brought into being by the
completion of the space; we shall not dwell on these
matters here. The inner product of the space is

$$ (\mathbf{M}, \mathbf{N}) = \int_V \mathbf{M}(\mathbf{s}) \cdot \mathbf{N}(\mathbf{s}) \, d^3\mathbf{s} \qquad (4) $$

The norm of an element is

$$ \|\mathbf{M}\| = (\mathbf{M}, \mathbf{M})^{1/2} = \left[ \int_V |\mathbf{M}(\mathbf{s})|^2 \, d^3\mathbf{s} \right]^{1/2} $$

If $\|\mathbf{M}\|$ is normalized by the square root of the volume of
the  seamount,  it  is  just  the  rms  magnetization.  There  are 
a  number  of  reasons  why  a  Hilbert  space  is  a  convenient 
choice  for  the  linear  vector  space;  the  principal  one  is  that 
optimization  problems  (minimization  of  a  norm,  for 
example)  are  particularly  simple  in  such  spaces  because 
they  have  unique  solutions,  linearly  related  to  the  data  in 
many  cases. 

Let us first assume that the data $d_1, d_2, \ldots, d_N$ are to be
satisfied exactly and that the required uniform magnetization
is already known; we call it $\mathbf{U} \in P$. Notice that $\mathbf{U}$ is
not simply a vector in ordinary space; it is a vector-valued
function throughout the region $V$ that is constant in magnitude
and direction at every point. Equation (4) suggests
that we can write (3) as an inner product:

$$ d_j = (\mathbf{G}_j, \mathbf{M}), \qquad j = 1, 2, \ldots, N \qquad (5) $$

This is possible because for observations outside $V$, every
$\mathbf{G}_j$ has a bounded norm and is therefore a valid element
of $P$. Such elements are called representers in the
mathematical literature, a name preferred by the authors
to the geophysical term "data kernels." Stated in the context
of the Hilbert space $P$, our problem can be posed as
follows: we must find $\mathbf{M}$ satisfying (5) such that $\|\mathbf{M} - \mathbf{U}\|$
is as small as possible. This corresponds to making our
model as close to a particular uniform model as it can be,
in the sense of the norm of $P$. To solve the problem,
write the nonuniform part of the magnetization as $\mathbf{R}$:

$$ \mathbf{R} = \mathbf{M} - \mathbf{U} \qquad (6) $$

and take the inner product with $\mathbf{G}_j$; from (5) we have

$$ d_j - (\mathbf{G}_j, \mathbf{U}) = (\mathbf{G}_j, \mathbf{R}), \qquad j = 1, 2, \ldots, N \qquad (7) $$

Since everything on the left is known, the problem is to
find the element $\mathbf{R}$ of smallest norm obeying a given finite
collection of inner product constraints. The solution to
this kind of optimization problem has appeared in the
geophysical literature many times [e.g., Backus, 1970; Parker,
1977]. For completeness we sketch the derivation here.
We show in appendix A that the representers $\mathbf{G}_j$, where
$j = 1, 2, \ldots, N$, are linearly independent; therefore they
form the basis for an $N$-dimensional, and therefore
closed, subspace of $P$, which we call $G$. The decomposition
theorem for Hilbert spaces [Luenberger, 1969] states
that any element of $P$ may be written as a sum of two
parts, one lying in $G$ and the other in $G^\perp$, with the subspace
of elements orthogonal to the elements of $G$ called
the orthogonal complement of $G$. We decompose $\mathbf{R}$ in
this way:

$$ \mathbf{R} = \mathbf{S} + \mathbf{T} \qquad (8) $$

where $\mathbf{S} \in G$ and $\mathbf{T} \in G^\perp$; obviously, $(\mathbf{S}, \mathbf{T}) = 0$, and
then it follows that

$$ \|\mathbf{R}\|^2 = \|\mathbf{S}\|^2 + \|\mathbf{T}\|^2 \qquad (9) $$

When (8) is substituted into (7) we see that only the $\mathbf{S}$
component of $\mathbf{R}$ affects the fit to the data, because
$(\mathbf{G}_j, \mathbf{T}) = 0$ for all $j$; hence we can choose the part of $\mathbf{R}$
lying in $G^\perp$ at will. Equation (9) shows us that we should
make $\mathbf{T}$ the zero element of $P$, for this gives us the smallest
norm of all. Now we must adjust $\mathbf{S}$ to obtain agreement
with the data. Because the elements $\mathbf{G}_j$ form a basis
for $G$,

$$ \mathbf{S} = \sum_{j=1}^{N} \alpha_j \mathbf{G}_j $$

We have concluded that $\mathbf{T} = 0$, and we know from (8)
that $\mathbf{R} = \mathbf{S}$; from (6), $\mathbf{M} = \mathbf{U} + \mathbf{S}$, and so the magnetization
nearest $\mathbf{U}$ satisfying the data is

$$ \mathbf{M}_0 = \mathbf{U} + \sum_{j=1}^{N} \alpha_j \mathbf{G}_j \qquad (10) $$

The decomposition theorem has reduced the problem of
finding an optimum element from a search in an
infinite-dimensional space to a problem in a finite number of
unknowns. All that remains to be done is to find the
expansion coefficients $\alpha_j$, and the model nearest $\mathbf{U}$ has
been found. This is accomplished by substituting (10)
into (5): we obtain the system of linear equations

$$ \sum_{j=1}^{N} \Gamma_{jk} \alpha_j = d'_k, \qquad k = 1, 2, \ldots, N $$

where

$$ d'_j = d_j - (\mathbf{G}_j, \mathbf{U}) $$

and

$$ \Gamma_{jk} = (\mathbf{G}_j, \mathbf{G}_k) \qquad (11) $$

The matrix $\Gamma$ of all possible pairs of inner products of the
representers is fundamental in much of the theory; it is
called the Gram matrix. The representers are linearly
independent, from which it follows that the Gram matrix
is nonsingular [Luenberger, 1969], and so there is a unique
solution to the linear system for the $\alpha_j$.
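The construction in (7) to (11) can be tried on a small discretized stand-in for the Hilbert space. In the sketch below each representer is replaced by a vector of samples on a set of equal-volume cells, so the inner product (4) becomes a weighted dot product; the random representers, the cell volume, and the variable names are all assumptions for illustration, not quantities from a real survey.

```python
import numpy as np

rng = np.random.default_rng(0)
n_data, n_cells = 5, 40

# Row j holds a made-up representer G_j sampled on the cells
# (three vector components per cell).
reps = rng.normal(size=(n_data, 3 * n_cells))
cell_volume = 2.0                                  # equal cell volumes (assumed)
uniform = np.tile([0.0, 0.0, 1.5], n_cells)        # a known uniform element U
data = rng.normal(size=n_data)                     # synthetic data d_j


def inner(f, g):
    """Discrete stand-in for the inner product (4)."""
    return cell_volume * (f @ g)


# Gram matrix (11) and reduced data d'_j = d_j - (G_j, U).
gram = np.array([[inner(gj, gk) for gk in reps] for gj in reps])
reduced = data - np.array([inner(gj, uniform) for gj in reps])

alpha = np.linalg.solve(gram, reduced)             # expansion coefficients
model = uniform + alpha @ reps                     # equation (10)
predicted = np.array([inner(gj, model) for gj in reps])
```

Here `predicted` reproduces `data` to rounding error, and the nonuniform part `alpha @ reps` lies entirely in the span of the representers, as the decomposition theorem requires.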

In reality we know neither the direction nor the intensity
of the uniform magnetization that best approximates the
interior magnetization: determination of $\mathbf{U}$ is one of the
most important goals of our investigation. Also we must
not demand precise agreement between the predictions of
the theory and the observations. The complete solution to
the problem will be developed in two stages: first we admit
$\mathbf{U}$ to be unknown; then we allow misfit.

To determine the unknown $\mathbf{U}$, imagine making a guess
for that element, solving (7) for the smallest $\mathbf{R}$, and then
repeating the process for a series of different guesses.
Clearly the best solution of the series would be the one
that causes $\mathbf{R}$ to be smallest, for then it would be the
nearest one to some uniform model in the set of guess
models. To solve the general problem, we analyze this
hypothetical optimization problem over the space of all
possible elements $\mathbf{U}$. In fact $\mathbf{U}$ belongs to a
three-dimensional subspace of $P$, because any such element can
be written

$$ \mathbf{U} = \beta_1 \mathbf{X}_1 + \beta_2 \mathbf{X}_2 + \beta_3 \mathbf{X}_3 $$

where $\mathbf{X}_1$, $\mathbf{X}_2$, and $\mathbf{X}_3$ are fixed elements of $P$ representing
uniform magnetizations of unit intensity in three mutually
perpendicular directions. From (10) the most nearly uniform
solution takes the form

$$ \mathbf{M} = \sum_{n=1}^{3} \beta_n \mathbf{X}_n + \sum_{j=1}^{N} \alpha_j \mathbf{G}_j \qquad (12) $$

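We now consider both the $\alpha$ and $\beta$ coefficients to be free
in constructing a model that fits the data:

$$ \left( \mathbf{G}_j, \left[ \sum_{n=1}^{3} \beta_n \mathbf{X}_n + \sum_{k=1}^{N} \alpha_k \mathbf{G}_k \right] \right) = d_j \qquad (13) $$

But in addition we want to minimize the distance squared
between $\mathbf{U}$ and $\mathbf{M}$:

$$ \|\mathbf{M} - \mathbf{U}\|^2 = \left( \left[ \sum_{j=1}^{N} \alpha_j \mathbf{G}_j \right], \left[ \sum_{k=1}^{N} \alpha_k \mathbf{G}_k \right] \right) \qquad (14) $$

These expressions are more easily grasped if we take all
the sums outside the inner products and introduce matrix
notation:

$$ A \boldsymbol{\beta} + \Gamma \boldsymbol{\alpha} = \mathbf{d} \qquad (13') $$

$$ \|\mathbf{M} - \mathbf{U}\|^2 = \boldsymbol{\alpha}^T \Gamma \boldsymbol{\alpha} \qquad (14') $$

Here $\Gamma$ is the Gram matrix and the meaning of the vectors
$\boldsymbol{\alpha} \in R^N$, $\boldsymbol{\beta} \in R^3$, and $\mathbf{d} \in R^N$ is obvious; the components
of the $N$ by 3 matrix $A$ are given by

$$ A_{jn} = (\mathbf{G}_j, \mathbf{X}_n) $$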

The matrix $A$ contains the solution to the forward problem
for uniform seamounts because its elements are the
magnetic fields at the observation positions due to unit
uniform magnetizations in the three orthogonal axis directions;
we call $A$ the Green matrix. Only this matrix is
needed in the conventional least squares fitting process.

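The simplest way to minimize (14') with (13') as a constraint
is to introduce a set of $N$ Lagrange multipliers
$\lambda_1, \lambda_2, \ldots, \lambda_N$, which we can collapse into the vector
$\boldsymbol{\lambda} \in R^N$, and minimize the unconstrained functional

$$ \boldsymbol{\alpha}^T \Gamma \boldsymbol{\alpha} + \boldsymbol{\lambda}^T [A \boldsymbol{\beta} + \Gamma \boldsymbol{\alpha} - \mathbf{d}] $$

over the vectors $\boldsymbol{\alpha}$, $\boldsymbol{\beta}$, and $\boldsymbol{\lambda}$. The solution requires only
elementary calculus; we eliminate $\boldsymbol{\lambda}$ and then solve a
linear system of equations for the coefficients, most neatly
written:

$$ \begin{bmatrix} \Gamma & A \\ A^T & O \end{bmatrix} \begin{bmatrix} \boldsymbol{\alpha} \\ \boldsymbol{\beta} \end{bmatrix} = \begin{bmatrix} \mathbf{d} \\ \mathbf{0} \end{bmatrix} \qquad (15) $$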

where $O$ is a 3 by 3 matrix of zeros and $\mathbf{0}$ is a 3-vector
of zeros. The necessary and sufficient condition that (15)
has a unique solution is that the projections of the elements
$\mathbf{X}_n$ into the subspace $G$ spanned by the
representers should be linearly independent or
equivalently that

$$ \left( \mathbf{G}_j, \sum_{n=1}^{3} \beta_n \mathbf{X}_n \right) = 0, \qquad j = 1, 2, \ldots, N $$

only if all $\beta_n$ are zero. Unfortunately this is not true in
general; for example, if all the observations lie in the same
plane as $\hat{\mathbf{B}}_0$, and the plane is a plane of symmetry of the
seamount, then the condition does not hold. In such a
geometry, the data contain insufficient information to
decide between members of a subspace of uniform components
each of which yields an equally small $\|\mathbf{R}\|$. Naturally,
such highly symmetric situations will never arise in
practice, although the reader may have guessed that we
stumbled on such a case in tests with artificial models.
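A minimal numerical sketch of the bordered system (15) follows. As before, the representers are random vectors on a grid of unit-volume cells rather than fields of a real seamount, and the names `most_uniform_exact`, `reps`, and `x_unit` are hypothetical.

```python
import numpy as np


def most_uniform_exact(gram, green, data):
    """Solve (15): [[Gamma, A], [A^T, O]] [alpha; beta] = [d; 0]."""
    n, p = green.shape
    bordered = np.block([[gram, green],
                         [green.T, np.zeros((p, p))]])
    solution = np.linalg.solve(bordered, np.concatenate([data, np.zeros(p)]))
    return solution[:n], solution[n:]


rng = np.random.default_rng(1)
n_data, n_cells = 6, 30
reps = rng.normal(size=(n_data, 3 * n_cells))    # made-up discretized G_j
x_unit = np.tile(np.eye(3), (n_cells, 1))        # columns play the role of X_1..X_3
gram = reps @ reps.T                             # Gamma_jk = (G_j, G_k)
green = reps @ x_unit                            # A_jn = (G_j, X_n)
data = rng.normal(size=n_data)

alpha, beta = most_uniform_exact(gram, green, data)
nonuniform = alpha @ reps                        # R, the nonuniform part
```

The second block row of (15) states that `green.T @ alpha` vanishes, which is the same as saying that `nonuniform` is orthogonal to every uniform magnetization. If the columns of `green` were linearly dependent, as in the symmetric geometry described above, `np.linalg.solve` would fail or return an ill-determined `beta`.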

After (15) is solved the vectors $\boldsymbol{\alpha}$ and $\boldsymbol{\beta}$ are put into
the expansion (equation (12)), and the desired
most-uniform model results. We call the most nearly uniform
solution $\mathbf{M}_*$ and the associated uniform and nonuniform
pieces $\mathbf{U}_*$ and $\mathbf{R}_*$. From the perspective of functional
analysis,  the  quantity  we  are  minimizing  is  the  norm  of 
the  projection  of  the  magnetization  onto  the  orthogonal 
complement  of  the  three-dimensional  subspace  of  uniform 
magnetizations.  Thus  we  are  performing  a  regularization 
of the problem in which $\|\mathbf{R}_*\|$ is a seminorm of $\mathbf{M}$
[Luenberger, 1969]. A seminorm is a functional possessing all
the properties of an ordinary norm save one: $\|\mathbf{R}_*\|$ can
vanish when $\mathbf{M}$ is not the zero element of $P$. The optimization
problem we have solved is called seminorm minimization.
One nice general property is that the part of the
solution lying in the subspace penalized by the norm (the
nonuniform magnetization) is orthogonal to the other part
of the seminorm minimizing solution (the uniform part).
Any portion of the solution lying in the subspace spanned
by the $\mathbf{X}_n$ is drawn from $\mathbf{U}_*$, and therefore it is not found
in $\mathbf{R}_*$ where it would only increase the seminorm
unnecessarily.

A  final  refinement  to  the  theory  allows  for  some 
disagreement  between  the  predictions  of  our  model  and 
the  observations.  We  determine  the  model  nearest  to  a 
uniform  magnetization  but  fitting  the  data  to  a  precision 
dictated  by  the  amplitude  of  the  noise  in  the  observations. 
For  computational  convenience  we  turn  to  the  Euclidean 
distance  or  two-norm  as  a  measure  of  the  misfit  between 
model  predictions  and  the  data.  The  rms  field  arising 
from  crustal  sources  not  in  the  seamount  and  other 
extraneous  signals  can  be  estimated  by  examining  mag¬ 
netic  data  obtained  in  the  survey  region  but  far  enough 
away  from  the  seamount  for  its  influence  to  be  negligible. 
Another  source  of  uncertainty  arises  from  the  approxima¬ 
tion  of  the  seamount's  shape;  we  shall  treat  this  factor  in 
detail  in  the  next  section. 

The discrepancy between observation and model prediction
should be no more than the magnitude of the overall
estimated uncertainty. Therefore (5) is replaced by

$$ \sum_{j=1}^{N} [d_j - (\mathbf{G}_j, \mathbf{M})]^2 \le S^2 \qquad (16) $$

where $S/\sqrt{N}$ is the estimated rms noise. This condition
insures that the seamount we find will be the most uniform
of all those in satisfactory accord with the magnetic
field observations.

The arguments given earlier using the decomposition
theorem apply equally well here: $\mathbf{M}$ must take the form
given in (12) so that the optimization problem is reduced
to finding the vectors $\boldsymbol{\alpha}$ and $\boldsymbol{\beta}$. In terms of these (16)
becomes

$$ \|A \boldsymbol{\beta} + \Gamma \boldsymbol{\alpha} - \mathbf{d}\|^2 \le S^2 \qquad (17) $$

This constraint appears difficult because it is an inequality,
but it is not hard to show that equality applies for any normal
data set. The John multiplier theorem (see Smith,
1974, chap. 3) provides the technique for minimizing a
functional subject to inequality constraints. The idea is
almost identical to the more familiar method of Lagrange
multipliers: the inequality constraints are appended to the
functional under variation. The multiplicative factors
behave as they do with equality constraints, but the sign
of the multipliers is fixed, and for each one there is an
additional "complementary slackness condition." In this
particular problem there is just one multiplier, $\mu \ge 0$, and
we minimize the functional

$$ \boldsymbol{\alpha}^T \Gamma \boldsymbol{\alpha} + \mu \|A \boldsymbol{\beta} + \Gamma \boldsymbol{\alpha} - \mathbf{d}\|^2 \qquad (18) $$

over all vectors $\boldsymbol{\alpha}$ and $\boldsymbol{\beta}$ such that

$$ \mu \left[ \|A \boldsymbol{\beta} + \Gamma \boldsymbol{\alpha} - \mathbf{d}\|^2 - S^2 \right] = 0 \qquad (19) $$

In the complementary slackness condition (equation (19)),
either the factor in brackets vanishes, and then equality
applies in (17), or $\mu$ is zero. If $\mu$ vanishes it is clear that
(18) has its only minimum at $\boldsymbol{\alpha} = 0$, since $\Gamma$ is positive
definite. This corresponds to an exactly uniform magnetization.
The motivation for our theory is the fact that a
totally uniform body cannot fit the magnetic data to the
necessary precision, and so normally $\mu$ is nonzero. We
conclude, therefore, that equality is achieved in (17) so
that (19) may be obeyed. The conditions of the John
multiplier theorem apply at any local minimum of the
functional, but because the norm is a convex functional,
and the constraint (17) constitutes a convex set of points,
any local minimum must also be the global minimizer of
the functional by a well-known property of convexity
[Luenberger, 1969].

It may be helpful to interpret the minimization of (18)
as an intermediate problem lying between two extremes:
the conventional least squares fitting by a uniform body
when $\mu$ tends to zero and the construction of an exactly
fitting model when $\mu$ becomes large. In the latter limit,
the solution is the one that possesses the smallest nonuniform
component; it has no particular relation to the smallest
norm model.

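To find the minimum, we differentiate (18) with respect
to $\boldsymbol{\alpha}$, $\boldsymbol{\beta}$, and $\mu$ in the usual way; after some rearrangement
the equations derived from variation of $\boldsymbol{\alpha}$ and $\boldsymbol{\beta}$ can be
written

$$ \begin{bmatrix} I/\mu + \Gamma & A \\ A^T & O \end{bmatrix} \begin{bmatrix} \boldsymbol{\alpha} \\ \boldsymbol{\beta} \end{bmatrix} = \begin{bmatrix} \mathbf{d} \\ \mathbf{0} \end{bmatrix} \qquad (20) $$

which is a linear system if $\mu$ is fixed. In contrast to the
situation with precisely matching data, we find that the
multiplier $\mu$ cannot easily be eliminated. To find $\mu$ we
must appeal to (17) taking the equality; a little algebra
simplifies the condition to

$$ \mu^{-2} \|\boldsymbol{\alpha}\|^2 = S^2 \qquad (21) $$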
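The rearrangement behind (20) and the "little algebra" behind (21) can be spelled out as follows. This is a reconstruction from (18), using only the symmetry and nonsingularity of $\Gamma$ and the assumption $\mu > 0$; it is offered as a check on the reasoning rather than as the authors' own derivation.

```latex
\begin{align*}
\text{variation of } \boldsymbol{\alpha}:\quad
  & 2\Gamma\boldsymbol{\alpha}
    + 2\mu\,\Gamma\,(A\boldsymbol{\beta} + \Gamma\boldsymbol{\alpha} - \mathbf{d}) = 0
  \;\Longrightarrow\;
  (I/\mu + \Gamma)\,\boldsymbol{\alpha} + A\boldsymbol{\beta} = \mathbf{d}, \\
\text{variation of } \boldsymbol{\beta}:\quad
  & 2\mu\,A^{T}(A\boldsymbol{\beta} + \Gamma\boldsymbol{\alpha} - \mathbf{d}) = 0
  \;\Longrightarrow\;
  A^{T}\boldsymbol{\alpha} = 0, \\
\text{misfit}:\quad
  & A\boldsymbol{\beta} + \Gamma\boldsymbol{\alpha} - \mathbf{d}
    = -\boldsymbol{\alpha}/\mu
  \;\Longrightarrow\;
  \|A\boldsymbol{\beta} + \Gamma\boldsymbol{\alpha} - \mathbf{d}\|^{2}
    = \mu^{-2}\,\|\boldsymbol{\alpha}\|^{2} = S^{2}.
\end{align*}
```

The first line is the top block row of (20) after cancelling the nonsingular factor $\Gamma$; the second uses the first to replace the residual by $-\boldsymbol{\alpha}/\mu$; the third is the equality form of (17).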

In outline the solution of the problem proceeds in this
way: the vector $\boldsymbol{\alpha}$ may be regarded as a known function of
$\mu$ through the solution of (20) (assuming the symmetry of
the problem allows a unique solution); then (21) is a nonlinear
equation for $\mu$ which we may write

$$ F(\mu) = S^2 \qquad (22) $$

After some manipulation we can find an explicit expression
for the derivative of $F$:

$$ \frac{dF}{d\mu} = -\frac{2}{\mu^4} (H\boldsymbol{\alpha})^T \left[ \mu \Gamma^2 + \Gamma + A (A^T H A)^{-1} A^T \right] (H\boldsymbol{\alpha}) $$

where the matrix $H$ is defined by

$$ H = (I/\mu + \Gamma)^{-1} $$

The positive definiteness of $\Gamma$ insures the same property
for the factor in brackets when $\mu > 0$, from which it follows
that $dF/d\mu$ is always negative. Thus the solution to
(22) is unique if it exists. It may be verified that as
$\mu \to \infty$, $F(\mu) \to 0$, which is consistent with the idea that
this limit corresponds to the problem of finding a magnetization
fitting the data exactly. The maximum $F$ occurs as
$\mu \to 0$, corresponding to $\boldsymbol{\alpha} = 0$ and the problem of least
squares fitting a uniform model. The value of $\mu$ associated
with a particular $S^2 < F(0)$ can be found by iteration
with Newton's method, which always converges if the
starting approximation is less than the true value of $\mu$;
this follows from $d^2F/d\mu^2 > 0$, a result requiring heavy
algebra to obtain. In practice, having found a value of $\mu$
that yields plausible misfits to the data, we usually sweep
through a range of values in its vicinity to examine the
different solutions.
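The iteration just described can be sketched in a few lines. The code below solves (20) at fixed $\mu$, evaluates $F(\mu) = \mu^{-2}\|\boldsymbol{\alpha}\|^2$ together with the explicit derivative, and applies Newton's method to (22) starting from a small $\mu$. The synthetic matrices, the starting value, the tolerance, and the function names are assumptions for illustration; explicit inverses are used only for brevity.

```python
import numpy as np


def solve_20(gram, green, data, mu):
    """Solve the linear system (20) for alpha and beta at a fixed mu."""
    n, p = green.shape
    lhs = np.block([[np.eye(n) / mu + gram, green],
                    [green.T, np.zeros((p, p))]])
    sol = np.linalg.solve(lhs, np.concatenate([data, np.zeros(p)]))
    return sol[:n], sol[n:]


def misfit_and_slope(gram, green, data, mu):
    """Return F(mu) = |alpha|^2 / mu^2 and dF/dmu from the explicit formula."""
    n = len(data)
    alpha, _ = solve_20(gram, green, data, mu)
    h = np.linalg.inv(np.eye(n) / mu + gram)
    h_alpha = h @ alpha
    bracket = (mu * (gram @ gram) + gram
               + green @ np.linalg.solve(green.T @ h @ green, green.T))
    return alpha @ alpha / mu**2, -2.0 / mu**4 * (h_alpha @ bracket @ h_alpha)


def find_mu(gram, green, data, s_squared, mu=1e-4, tol=1e-10, max_iter=100):
    """Newton iteration on F(mu) = S^2, started below the root."""
    for _ in range(max_iter):
        f, slope = misfit_and_slope(gram, green, data, mu)
        if abs(f - s_squared) <= tol * s_squared:
            break
        mu -= (f - s_squared) / slope
    return mu


# Synthetic test problem (not survey data).
rng = np.random.default_rng(2)
reps = rng.normal(size=(8, 60))
x_unit = np.tile(np.eye(3), (20, 1))
gram, green = reps @ reps.T, reps @ x_unit
data = rng.normal(size=8)

# F(0) is the squared residual of the least squares uniform fit.
coef = np.linalg.lstsq(green, data, rcond=None)[0]
f_zero = np.sum((data - green @ coef) ** 2)
target = 0.25 * f_zero                       # a chosen S^2 < F(0)
mu = find_mu(gram, green, data, target)
alpha, beta = solve_20(gram, green, data, mu)
```

Because the starting value lies below the root and $F$ is decreasing and convex, each Newton step increases $\mu$ without overshooting. Sweeping through values near the solution, as suggested in the text, amounts to calling `solve_20` over a grid of `mu`.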

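With inexact fitting there is an interpretation of the
coefficients $\boldsymbol{\alpha}$ that has no counterpart in the analysis of
precise data; the equations obtained upon variation of the
functional with $\boldsymbol{\alpha}$ can be expressed as

$$ \boldsymbol{\alpha} = \mu [\mathbf{d} - (A \boldsymbol{\beta} + \Gamma \boldsymbol{\alpha})] $$

or

$$ \alpha_j = \mu [d_j - (\mathbf{G}_j, \mathbf{M})] $$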

Since  the  term  in  brackets  is  the  discrepancy  between  the 
predictions  of  the  model  and  the  observations,  this  equa¬ 
tion  says  that  the  individual  misfits  to  the  data  are  each 
proportional  to  an  expansion  coefficient  of  R  in  the  basis 
of  representers. 

The  last  matter  to  be  dealt  with  in  this  section  is  the 
inclusion  of  the  corrections  to  the  ambient  field  to  allow 
for  inaccuracies  of  the  main  field  model.  We  can  do  no 
better  than  the  traditional  treatment  and  allow  three 
further  unknown  parameters  that  correct  for  the  presence 
of  a  linear  variation  of  the  ambient  field  over  the  survey 
region. In place of (5) the theoretical prediction from the
model takes the form

$$ d_j = \gamma_0 + \boldsymbol{\gamma} \cdot \mathbf{r}_j + (\mathbf{G}_j, \mathbf{M}), \qquad j = 1, 2, \ldots, N $$

where $\boldsymbol{\gamma}$ is a horizontal vector (the gradient of the
ambient correction) and $\gamma_0$ is the unknown offset of the
ambient field from the main field model. Formally this is
just an inner product on another Hilbert space $P'$ whose
elements consist of ordered triples $[p; \mathbf{q}; \mathbf{F}]$ where $p$ is a
real number, $\mathbf{q}$ is a horizontal vector, and $\mathbf{F}$ is an element
in $P$. Then the associated inner product for $\mathbf{F}', \mathbf{G}' \in P'$ is

$$ (\mathbf{F}', \mathbf{G}')' = ([p; \mathbf{q}; \mathbf{F}], [s; \mathbf{t}; \mathbf{G}])' = ps + \mathbf{q} \cdot \mathbf{t} + (\mathbf{F}, \mathbf{G}) $$

The development proceeds in exactly the same way in the
new space when the representers for the anomaly data are
chosen to be

$$ \mathbf{G}'_j = [1; \mathbf{r}_j; \mathbf{G}_j] $$

and the origin of coordinates is in the sea surface, so that
$\mathbf{r}_j$, the position vector of the $j$th observation point, is
horizontal. In $P'$ the completely uniform seamount is one of
the form $[p; \mathbf{q}; \beta_1 \mathbf{X}_1 + \beta_2 \mathbf{X}_2 + \beta_3 \mathbf{X}_3]$; then all the
equations derived earlier apply without change. Now the
coefficient vector $\boldsymbol{\beta}$ in the matrix equations lies in $R^6$
while $\boldsymbol{\alpha}$ is unchanged in dimension and meaning.
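In matrix terms the move to $P'$ only changes how the Gram and Green matrices are assembled. The sketch below builds the augmented matrices from the original ones and the horizontal observation coordinates; the function name `augment` and the ordering of the six columns (offset, two gradient components, three uniform components) are assumptions chosen for illustration.

```python
import numpy as np


def augment(gram, green, positions):
    """Gram and Green matrices in P' for representers G'_j = [1; r_j; G_j]."""
    xy = np.asarray(positions, dtype=float)        # N x 2 horizontal coordinates
    ones = np.ones((len(xy), 1))
    # (G'_j, G'_k)' = 1 * 1 + r_j . r_k + (G_j, G_k)
    gram_aug = 1.0 + xy @ xy.T + gram
    # Columns: offset, x gradient, y gradient, then the original N x 3 matrix A.
    green_aug = np.hstack([ones, xy, green])
    return gram_aug, green_aug


# Tiny made-up example with three observation points.
gram_aug, green_aug = augment(np.eye(3), np.zeros((3, 3)),
                              [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
```

The augmented pair can be passed unchanged to whatever routine solves (15) or (20); the returned coefficient vector then has six entries, the first three describing the ambient field correction.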

3.  Approximations 

In  this  section  we  explain  the  nature  of  the  approxima¬ 
tions  and  show  that  the  errors  introduced  by  them  can  be 
kept  to  acceptably  low  levels  in  comparison  to  other 
sources  of  uncertainty  in  the  data.  The  first  question  we 
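address is the calculation of the Gram matrix elements.
The definition of the representers from (2), together with
the inner product of $P$ from (4) and the Gram matrix
from (11), gives an explicit expression for an element of
$\Gamma$:

$$ \Gamma_{jk} = \left( \frac{\mu_0}{4\pi} \right)^2 \int_V \left[ \hat{\mathbf{B}}_0 \cdot \nabla \nabla \frac{1}{|\mathbf{r}_j - \mathbf{s}|} \right] \cdot \left[ \hat{\mathbf{B}}_0 \cdot \nabla \nabla \frac{1}{|\mathbf{r}_k - \mathbf{s}|} \right] d^3\mathbf{s} \qquad (23) $$

Here $\nabla$ means $\nabla_s$, differentiation with respect to $\mathbf{s}$; we
have used the fact that

$$ \nabla_r \frac{1}{|\mathbf{r} - \mathbf{s}|} = -\nabla_s \frac{1}{|\mathbf{r} - \mathbf{s}|} $$

After we have taken advantage of the symmetry of $\Gamma$
there are $N(N + 1)/2$ integrals like this to be evaluated; a
similar set of $3N$ integrals is required to find the elements
of the Green matrix $A$. Because $N$ is typically 100 or
more, many thousands of integrals must be carried out.
Several approximations are required in order to calculate
these numbers.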

[Fig. 1, panels (a) and (b). Surviving caption text: "... triangular tessellation of them based upon Watson's ..."]

Fig. 2. Shape factor $\Theta(T)$ of equation (24) for triangles, as a function of two of their vertex angles in degrees. All possible triangles are covered by this diagram.

The region $V$ in (23) is the set of points defining the
seamount. We cannot know the exact shape of the bottom
boundary of $V$ where the newer lavas of the volcanic
body lie in contact with the original crust. We approximate
this surface by a horizontal plane at the mean level
of the surrounding terrain. The upper surface of the
seamount is known in considerable detail, but it too must
be approximated in our calculations because it is so complex.
It is important that the approximation can be
formed from samples of the bathymetry not disposed in
any regular manner because we shall show that the
computationally optimal spacing of bathymetry samples
depends directly on the water depth: topography in shallow
water should be sampled more densely than that in deep
water. Our approach has been to represent the surface of
the seamount by a tessellation of triangular facets; the
facets are the plane interpolations of sample points drawn
from the bathymetric data. As suggested by Figure 1,
there are many ways in which a given arbitrary collection
of points in the plane may be connected together to form
nonoverlapping triangles, and each of these yields a
different interpolation. In appendix B we show that the
rms error of the interpolation can be deduced from the
power spectrum of the topography; for young seafloor,
Fox and Hayes [1985] find that a power law is a good
description of the spectrum over a large range of wave
numbers, and as proved in the appendix, this leads to the
following expression for the rms interpolation error averaged
over a triangular facet $T$:

$$ \bar{\delta}^2 = c^2 \left( \frac{A}{l_0^2} \right)^{(\eta - 1)/2} \Theta(T) \qquad (24) $$

where $A$ is the area of $T$, $\Theta(T)$ is a factor depending on
the triangle's shape, and $\eta$, $c$, and $l_0$ are constants;
values computed from the analysis of Fox and Hayes are
$\eta = 2.48$ and $c = 15.5$ m when $l_0$, which is an arbitrary
length scale, is set to 1 km. The shape factor $\Theta(T)$, given
in appendix B, is contoured in Figure 2; for a fixed
area, the equilateral triangle produces the least error with
$\Theta_{\min} = 0.320$, but any triangle whose angles all exceed 20°
is associated with an error only slightly larger, for then
$\Theta \le 0.575$. Table 1 gives the rms error for equilateral
triangles of various sizes. Although serious errors will not
be incurred unless a triangle is severely elongated, the
analysis indicates that the cells of the tessellation should
be chosen to be as nearly equilateral as possible. An
automatic procedure for doing this was given by Watson
[1982]; the method is based upon a theorem of
Delaunay stating that a triangular tessellation may be
arranged so that the circumcircle of every triangle contains
no vertex of any other triangle of the set. A slightly
modified version of Watson's program has proved to be
highly satisfactory: it is efficient, is reliable, and yields
sensible tessellations, for example, the one in Figure 1b.
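The circumcircle property quoted from Delaunay's theorem is easy to test directly. The sketch below is not Watson's program; it is a brute-force check, written for illustration, that a given list of triangles satisfies the empty-circumcircle condition. The function names and the four sample points are hypothetical.

```python
def in_circumcircle(a, b, c, p):
    """True if p lies strictly inside the circle through a, b and c."""
    # Orientation of the triangle: positive when a, b, c run counterclockwise.
    orient = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    (ax, ay), (bx, by), (cx, cy) = [(q[0] - p[0], q[1] - p[1]) for q in (a, b, c)]
    det = ((ax * ax + ay * ay) * (bx * cy - cx * by)
           - (bx * bx + by * by) * (ax * cy - cx * ay)
           + (cx * cx + cy * cy) * (ax * by - bx * ay))
    return det * orient > 0


def is_delaunay(points, triangles):
    """Check that no triangle's circumcircle contains another vertex."""
    for i, j, k in triangles:
        for m, p in enumerate(points):
            if m not in (i, j, k) and in_circumcircle(points[i], points[j], points[k], p):
                return False
    return True


# A thin rhombus can be split along its short or its long diagonal.
points = [(-2.0, 0.0), (0.0, -1.0), (2.0, 0.0), (0.0, 1.0)]
short_diagonal = [(0, 1, 3), (1, 2, 3)]
long_diagonal = [(0, 1, 2), (0, 2, 3)]
```

For these points `is_delaunay(points, short_diagonal)` holds and `is_delaunay(points, long_diagonal)` does not: the split along the short diagonal gives the less elongated triangles, in line with the preference for nearly equilateral cells.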

TABLE 1. Rms Error of Interpolation by Equilateral Triangles

Side, m     Rms error, m
  50          0.70
 100          1.2
 200          2.0
 300          2.6
 500          3.9
1000          6.4
1500          8.1
2000         10.7
2500         12.6
3000         14.4

Root-mean-square error in topography after interpolation
by an equilateral triangle with the given side length. The
values are calculated from equation (24) and the constants of
Fox and Hayes [1985].

The error in interpolation of the topography discussed
above is secondary to the consequent error introduced in
the computed magnetic anomaly by the approximation of
the true surface of the seamount by an artificially smooth
one. This factor will be treated as a noise term, that is, a
part of the measured field not fitted by the model. We
have no control over the contribution from approximation
of the base of the seamount by a level plane, but we shall
show that this error is not large in comparison with effects
of diurnal variations and residual crustal fields. The
approximation of the upper surface can be made as accurate
as we desire by choosing the triangle size sufficiently
small. It would be wasteful of computing resources, however,
to reduce this error far below those from other
sources.
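To judge how small the triangles need to be, the power-law error estimate can be evaluated directly. The sketch below assumes that equation (24) has the form $\bar{\delta}^2 = c^2 (A/l_0^2)^{(\eta-1)/2}\,\Theta(T)$ with the constants quoted in the text; that form is an assumption that reproduces the entries of Table 1 to within rounding, and the function name is hypothetical.

```python
import math

ETA = 2.48                  # spectral exponent quoted from Fox and Hayes
C_METRES = 15.5             # amplitude constant c, in metres
L0_METRES = 1000.0          # reference length l_0 = 1 km
THETA_EQUILATERAL = 0.320   # shape factor of an equilateral triangle


def rms_interpolation_error(side_m, theta=THETA_EQUILATERAL):
    """Rms interpolation error (m) for an equilateral facet of given side (m)."""
    area = math.sqrt(3.0) / 4.0 * side_m**2
    variance = C_METRES**2 * (area / L0_METRES**2) ** ((ETA - 1.0) / 2.0) * theta
    return math.sqrt(variance)


# Evaluate a few of the side lengths listed in Table 1.
errors = {side: rms_interpolation_error(side) for side in (50, 100, 500, 1000, 3000)}
```

The error grows roughly as the 0.74 power of the side length, so halving the triangle side reduces the rms error by about 40 percent while quadrupling the number of facets.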

The  magnetic  effect  of  the  lower  surface  will  be 
modeled  by  a  thin  horizontal  layer  located  at  the  level  of 
the  surrounding  crust  with  variable  thickness  8.  the 
difference  between  the  true  topography  and  that  of  the 
model  The  same  kind  of  model  is  used  for  the  upper 
boundary,  but  we  choose  the  level  to  be  that  of  the  shal¬ 
lowest  part  of  the  seamount,  this  safely  overestimates  the 
noise  signal  We  simplify  the  magnetization  in  the  layer 
by  using  a  constant  vector  M  The  expected  squared  mag¬ 
netic  signal  is  found  by  integrating  the  power  spectrum  of 
the  magnetic  field  at  the  sea  surface: 

A B:  =  E[(6„  ■  AB )]•’ 

=  J  S,(k)\Utk)\:  d:k  (25) 

R-' 

where $S_\delta$ is the power spectral density of the layer thickness
and where $U(\mathbf{k})$ is the (approximately linear)
transfer function between topography and surface magnetic
field given by Parker [1973],

$$ U(\mathbf{k}) = \pi \mu_0\, e^{-2\pi |\mathbf{k}| z_0}\, \hat{\mathbf{B}}_0 \cdot (|\mathbf{k}|\hat{\mathbf{z}} - i\mathbf{k})\; \mathbf{M} \cdot (|\mathbf{k}|\hat{\mathbf{z}} - i\mathbf{k}) \,/\, |\mathbf{k}| $$

where $z_0$ is the depth to the thin layer and $\hat{\mathbf{z}}$ is a vertical
unit vector; here we have taken only the first term of
Parker's series and converted to the Fourier transform
conventions set out in appendix B. The complex wave
number terms in $U(\mathbf{k})$ achieve their largest magnitudes
when $\hat{\mathbf{B}}_0$ and $\mathbf{M}$ are vertical; thus (25) gives the following
bound:

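$$ \Delta B^2 \le \int_0^\infty \pi^2 \mu_0^2\, e^{-4\pi k z_0}\, |\mathbf{M}|^2\, S_\delta(k)\, 2\pi k^3 \, dk \qquad (26) $$

where $k$ is written for $|\mathbf{k}|$. From this equation we can estimate the error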
introduced by giving the seamount a flat base. In this case
the effective area of the triangles is very large; thus we
may replace $S_\delta[k]$ by the original isotropic spectrum of the
bottom topography, $S[k]$, and choose $z_0$ to be the local
mean depth of the ocean in the absence of the seamount.
In appendix B we give an expression for $S[k]$ derived
from the data of Fox and Hayes [1985], using the power
law model:

$$ S[k] = c_1 (k/k_0)^{-\eta-1} $$

where $c_1$ = 27,400 m$^4$ for young volcanic terrain and $\eta$
and $k_0$ are as before. Substituting into (26) we find

$$ \Delta B^2 \le \frac{2\pi^3 \mu_0^2\, |\mathbf{M}|^2\, c_1 k_0^{\eta+1}\, \Gamma(3-\eta)}{(4\pi z_0)^{3-\eta}} \qquad (27) $$
for $|\mathbf{M}|$ = 10 A m$^{-1}$, a reasonable figure for young oceanic
basalts [Vacquier, 1971]. Thus we estimate that the rms error
from neglecting the roughness of the base of the
seamount ranges from about 9 nT in shallow water, say
$z_0$ = 1500 m, to about 5 nT in water 4 km deep.
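
The step from the thin-layer bound to a closed form rests on a Gamma-function integral, and a numerical check can make that step concrete. The sketch below assumes a generic isotropic power-law spectrum $S(k) = c_1 (k/k_0)^{-p}$ with an exponent `p` chosen freely for illustration, and the integrand $\pi^2\mu_0^2|\mathbf{M}|^2 k^2 e^{-4\pi k z_0} S(k)\, 2\pi k$; the function names and the parameter values in the usage lines are hypothetical and are not the constants adopted in the text.

```python
import math

def bound_closed_form(mu0, m_abs, c1, k0, p, z0):
    """Closed form 2 pi^3 mu0^2 |M|^2 c1 k0^p Gamma(4-p) / (4 pi z0)^(4-p)."""
    return (2.0 * math.pi ** 3 * mu0 ** 2 * m_abs ** 2 * c1 * k0 ** p
            * math.gamma(4.0 - p) / (4.0 * math.pi * z0) ** (4.0 - p))

def bound_quadrature(mu0, m_abs, c1, k0, p, z0, n=20000, s_max=8.0):
    """Midpoint-rule evaluation of the same integral over k from 0 to infinity.

    The substitution 4*pi*z0*k = s**2 removes the integrable singularity at
    k = 0; it is valid for p <= 3.5, where the exponent 7 - 2p is non-negative.
    """
    a = 4.0 * math.pi * z0
    h = s_max / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * h
        total += 2.0 * s ** (7.0 - 2.0 * p) * math.exp(-s * s)
    integral = total * h / a ** (4.0 - p)  # integral of k^(3-p) exp(-a k) dk
    return 2.0 * math.pi ** 3 * mu0 ** 2 * m_abs ** 2 * c1 * k0 ** p * integral

if __name__ == "__main__":
    # Hypothetical parameter values, for illustration only.
    params = dict(mu0=4e-7 * math.pi, m_abs=10.0, c1=27400.0, k0=1e-3, z0=1500.0)
    for p in (2.0, 3.5):
        print(p, bound_closed_form(p=p, **params), bound_quadrature(p=p, **params))
```

Agreement between the two functions only confirms the integration step; it says nothing about the appropriateness of the spectral model itself.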

The major difficulty in using (26) for the upper surface
of the seamount is in calculating the proper spectral density
$S_\delta[k]$. We simplify the interpolation process by
approximating it as a low-pass filter that never magnifies
the original topographic signal and whose gain falls with
increasing $k$. Thus we treat the residual $\delta$ as a high-passed
version of the topography. From the analysis of
appendix B we have one other property of the filter: the
total variance of the residual topography, $\delta_{rms}^2$, gives us
the power gain. Of all the filters obeying these constraints
we find the one acting on $S[k]$ that gives the largest possible
variance in $\Delta B$. Thus even though we may not know
the exact form of $S_\delta[k]$ we can still set an upper limit on
the interpolation error. It can be shown by the application
of the principles of linear programming [see Luenberger,
1969] that the optimal filter is a pure high-pass filter that
rejects all energy below some critical wave number and
passes a signal above that with constant gain. Using this
result and the expression for the power spectrum of the
field, we can maximize the variance with respect to the
two free parameters of the filter: its gain and the cutoff
wave number. After some lengthy algebra we obtain the
results about to be summarized. Define a dimensionless
quantity $q$ by


(r;-  1 )  i 

2771  1  Ml 


The largest possible mean square magnetic field error due
to neglect of the surface roughness is estimated to be
where it has been assumed that the spectrum is isotropic
(that is, depends only upon $k$) and we have substituted
Fig. 3. Upper bound on the rms error in the magnetic anomaly
due to smoothing of the surface of the seamount by interpolation;
$z_0$ is the water depth over the shallowest point of the seamount,
and $\delta_{rms}$ is the rms interpolation error. The magnetization is taken
to be 10 A m$^{-1}$, and the error varies in proportion to this figure.
Contour values are in nanoteslas.


where $Q(q) = q^{\eta-1}\,\Gamma(q, 3-\eta)$ and (in this definition
only!) $\Gamma(\cdot,\cdot)$ is the incomplete gamma function
[Abramowitz and Stegun, 1965, chap. 6]; $q_*$ is defined to be
the positive $q$ that makes $Q(q)$ maximum, and with the
value of $\eta$ that we have adopted, $q_*$ = 1.163 and
$Q(q_*)$ = 0.286. In Figure 3 we have contoured the rms
magnetic field in nanoteslas from these expressions, using
the value of 10 A m$^{-1}$ for $|\mathbf{M}|$. Combining the information
in Table 1 with that in Figure 3 we may conclude that,
depending on the water depth, triangles with sides
between 500 m and 2000 m can be used without introducing
errors larger than 5 to 10 nT.

The numerical evaluation of the integrals in (23)
represents the greatest computational burden in practical
calculations. A large computational saving is achieved by
reducing the volume integrals over $V$ to integrals over the
surface $\partial V$; following Parker [1971], we write (23) in the
form


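$$ \Gamma_{jk} = \left(\frac{\mu_0}{4\pi}\right)^2 \int_V \nabla F_j \cdot \nabla F_k \, d^3\mathbf{s} \qquad (28) $$

where

$$ F_j = \hat{\mathbf{B}}_0 \cdot \nabla \frac{1}{|\mathbf{r}_j - \mathbf{s}|} $$

which holds because $\hat{\mathbf{B}}_0$ is a constant vector. Next consider
the following identity, which is valid for sufficiently
smooth functions:

$$ \nabla F_j \cdot \nabla F_k = \nabla \cdot (F_j \nabla F_k) - F_j \nabla^2 F_k $$

All the measurement positions lie outside the seamount;
therefore $|\mathbf{r}_k - \mathbf{s}| > 0$, and so $\nabla^2 |\mathbf{r}_k - \mathbf{s}|^{-1} = 0$ for all
$\mathbf{s} \in V$. Since $\nabla^2$ commutes with $\hat{\mathbf{B}}_0 \cdot \nabla$, this implies
$\nabla^2 F_k = 0$, and thus

$$ \nabla F_j \cdot \nabla F_k = \nabla \cdot (F_j \nabla F_k) $$

Substituting this into (28) and applying Gauss' theorem
gives

$$ \Gamma_{jk} = \left(\frac{\mu_0}{4\pi}\right)^2 \int_{\partial V} F_j \nabla F_k \cdot \mathbf{n}(\mathbf{s}) \, d^2\mathbf{s} \qquad (29) $$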


where $\mathbf{n}(\mathbf{s})$ is the unit normal to the surface at the point
$\mathbf{s} \in \partial V$. We remark that there is no guarantee that $\mathbf{n}(\mathbf{s})$
exists for any surface point, nor do we know if Gauss'
theorem is valid for a surface whose power spectrum is
like that in equation (B7); since we apply this formula in
the simplified body with triangular facets, no difficulty
arises, for the theorem is undoubtedly correct for regions
with a piecewise plane boundary [Kellogg, 1953, chap. 4].
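
The reduction to a surface integral depends on the function $F = \hat{\mathbf{B}}_0 \cdot \nabla |\mathbf{r} - \mathbf{s}|^{-1}$ being harmonic inside the body. A small numerical check of that property is sketched below, using a central-difference Laplacian; the helper names `F` and `laplacian`, the step size, and the sample points are assumptions chosen for illustration.

```python
import math

def F(s, r, b):
    """F(s) = b . grad_s (1/|r - s|) = b . (r - s) / |r - s|^3."""
    d = [ri - si for ri, si in zip(r, s)]
    dist = math.sqrt(sum(x * x for x in d))
    return sum(bi * di for bi, di in zip(b, d)) / dist ** 3

def laplacian(f, s, h=1e-3):
    """Second-order central-difference Laplacian of f at the point s."""
    f0 = f(s)
    total = 0.0
    for i in range(3):
        s_plus = list(s)
        s_minus = list(s)
        s_plus[i] += h
        s_minus[i] -= h
        total += (f(s_plus) + f(s_minus) - 2.0 * f0) / (h * h)
    return total

if __name__ == "__main__":
    observer = (0.0, 0.0, 2.0)      # measurement position, outside the body
    field_dir = (0.3, -0.2, 0.93)   # direction of the ambient field (not normalized)
    interior = [0.4, -0.1, 0.0]     # a point well away from the observer
    print(laplacian(lambda p: F(p, observer, field_dir), interior))
```

The printed value should be zero to within the truncation and rounding error of the finite differences, provided the test point is not too close to the observer.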
The computation of the Gram matrix has been reduced
to the evaluation of (29) over the set of plane triangles by
which we have approximated the surface of the body. For
programming simplicity we also tessellate the lower, plane
boundary with the projections of the triangles defining the
upper surface. The elements of the matrix $A$ can be converted
to surface integrals by an analogous process. The
surface integrals for the Gram matrix cannot be performed
in terms of elementary functions except in certain special
cases; therefore we use a numerical integration formula,
technically known as a "cubature" rule, designed for triangular
regions. Stroud [1971] gives a complete review of
this question and provides several examples of the
required type. The integral of a smooth function defined
for points in the triangle $T$ is approximated by a weighted
sum of samples of the function:

$$ \int_T f(\mathbf{x}) \, d^2\mathbf{x} \approx \sum_i w_i f(\mathbf{x}_i) $$

By an appropriate choice of $w_i$ and $\mathbf{x}_i$ it is possible to
make the cubature formula exact for all polynomial functions
in the plane with degree no greater than some upper limit,
$p$; these are called degree-$p$ formulas. This is analogous
to the familiar Gaussian quadrature method on the real
line. The theory for functions of more than one variable
is far from complete: for example, the smallest number of
sampling points that will yield a degree-$p$ rule is not
known in general. It is nonetheless possible to generate a
suboptimal formula by taking combinations of one-dimensional
Gaussian rules in a so-called "conical product."
The optimal degree-5 rule for triangular regions is
known (a formula listed by Stroud [1971, p. 314]); it uses
seven sampling points. In contrast, the degree-5 conical
product formula requires nine sampling points. With a
complete theory, we would be able to choose a cubature
rule with guaranteed accuracy for every surface integral,
but with the presently available methods this is impractical.
Even for simple functions, bounding the error of this
kind of approximation is difficult. Furthermore, Stroud's
[1971, chap. 5] examples give the impression that the
bounds yielded by the available methods are of the crudest
kind, overestimating the true error by several orders of
magnitude in almost every case. Sard's [1963] theory,
summarized by Stroud, provides a useful result for purposes
of comparison: for an integration rule of degree $p$,
the error depends principally upon the magnitude of the
largest derivative of the order of $p + 1$; this rests on the
assumption that the integrand possesses all these derivatives,
which is true for our functions because they are analytic.
Roughly speaking, the integrand of (29) behaves
like $\partial R_j^{-1}\, \partial^2 R_k^{-1}$, where $\partial$ is a first-order differential
operator acting in the plane of the triangle and $R_j$ is the
distance from the observer $j$ to a point in the triangle (and
similarly for $R_k$). Combined with Sard's result, this suggests
that the largest errors occur in contributions to the
diagonal elements of $\Gamma$ by facets close to the measurement
position because these are the conditions under which the
highest derivatives arise.

Fig. 4. Relative error as a function of distance of four cubature
rules, $C_7$, $C_9$, $C_{16}$, and $C_{25}$; $C_7$ is a seven-point optimal degree-5
rule. The others are conical product rules of degrees 5, 7, and 9.
The integrals are the contributions to the diagonal element of the
Gram matrix of an equilateral facet, side $s$, and the observer over
the center of the triangle at a distance $D$.
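
To make the idea of a degree-5 cubature rule concrete, here is a minimal sketch of the widely tabulated seven-point, degree-5 rule for triangles (often attributed to Radon). Whether this is exactly the formula the authors took from Stroud [1971] is an assumption; the function names `triangle_area` and `integrate_triangle` are hypothetical.

```python
import math

SQRT15 = math.sqrt(15.0)
A1, B1 = (6.0 - SQRT15) / 21.0, (9.0 + 2.0 * SQRT15) / 21.0
A2, B2 = (6.0 + SQRT15) / 21.0, (9.0 - 2.0 * SQRT15) / 21.0
W0 = 9.0 / 40.0
W1 = (155.0 - SQRT15) / 1200.0
W2 = (155.0 + SQRT15) / 1200.0

# (barycentric coordinates, weight as a fraction of the triangle area)
RULE_7 = [
    ((1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0), W0),
    ((B1, A1, A1), W1), ((A1, B1, A1), W1), ((A1, A1, B1), W1),
    ((B2, A2, A2), W2), ((A2, B2, A2), W2), ((A2, A2, B2), W2),
]

def triangle_area(p0, p1, p2):
    """Area of a plane triangle with 2-D corners p0, p1, p2."""
    return 0.5 * abs((p1[0] - p0[0]) * (p2[1] - p0[1])
                     - (p2[0] - p0[0]) * (p1[1] - p0[1]))

def integrate_triangle(f, p0, p1, p2, rule=RULE_7):
    """Approximate the integral of f(x, y) over the triangle by a weighted sum."""
    total = 0.0
    for (l0, l1, l2), w in rule:
        x = l0 * p0[0] + l1 * p1[0] + l2 * p2[0]
        y = l0 * p0[1] + l1 * p1[1] + l2 * p2[1]
        total += w * f(x, y)
    return triangle_area(p0, p1, p2) * total

if __name__ == "__main__":
    ref = ((0.0, 0.0), (1.0, 0.0), (0.0, 1.0))
    # The exact integral of x^2 y^3 over the reference triangle is 2! 3! / 7!.
    print(integrate_triangle(lambda x, y: x ** 2 * y ** 3, *ref), 12.0 / 5040.0)
```

Because the weights are expressed as fractions of the area and the points in barycentric coordinates, the same table serves any triangle without a separate linear transformation step.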

As was noted earlier, some of the integrals in (29) can
be evaluated in elementary functions; this happens when
$\hat{\mathbf{z}} = \hat{\mathbf{B}}_0 = \mathbf{n}$ and $j = k$, that is, on the diagonal of the
Gram matrix. The expressions are complicated and will
not be set out here. By comparing the exact and numerical
values for these diagonal elements we can assess the
performance of the approximation when it is likely to be at
its worst. In Figure 4, we illustrate the performance of
four numerical cubature rules over a range of observer-triangle
distances. The integration is carried out over an
equilateral triangle, and the observer position moves on a
line normal to the facet passing through the centroid; all
the distances, $D$, shown on the horizontal axis are measured
to the closest point in the facet and normalized by the
side. The top line, marked $C_7$, gives the relative error as
a function of distance when the optimal seven-point rule is
used. The most important point to note is the strong
dependence of the error on distance between the observation
point and the facet. As is expected, the error is generally
smaller the more distant the observer is. We can be
more precise by using Sard's result: the formula is exact
for fifth-degree polynomials, so the absolute error should
be bounded by some constant times the maximum
magnitude of the sixth derivative of the integrand found
in the region of integration. A short calculation involving
spherical harmonics shows that for large distances the
sixth derivative of the integrand should decrease as the
inverse eleventh power of distance while the value itself
falls as the inverse fifth power; thus the relative error
ought to decrease with the inverse sixth power of distance.
Our numerical calculations confirm that the error falls at
this rate asymptotically. The line labeled $C_9$ shows the
error resulting from the nine-point, degree-5 conical product
rule; the error here also falls like the inverse sixth
power. Although the error is less at every distance for $C_9$
than for $C_7$, the improvement is slight in view of the need
for two more integrand evaluations. The behavior of the
16-point, conical product rule is shown by the line $C_{16}$; the
rule is exact for polynomials of degree 7, and so for large
$D$ the relative error drops as the inverse eighth power.
Similarly, the 25-point, conical product formula is a
degree-9 rule, although the behavior of the relative error
curve is not a simple monotone decrease with $D$. We
have experimented with a variety of triangle shapes; in the
comparison with the exact integration we conclude that if
the distance is normalized by the maximum length of a
side, Figure 4 gives an upper bound on the relative error,
and so it may be used as a guide to the expected accuracy
of the different cubature rules.
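
The product construction can be sketched in a few lines. The version below collapses the unit square onto the triangle and uses four-point Gauss-Legendre rules in both directions; this is a deliberately simplified stand-in, not the conical product formula of the text, which pairs a Legendre rule with a Jacobi-type rule and so reaches degree 7 with the same 16 points. The simplified variant is exact only through degree 6. All function names are hypothetical.

```python
import math

def gauss_legendre_4_unit():
    """Four-point Gauss-Legendre nodes and weights mapped from [-1, 1] to [0, 1]."""
    r = math.sqrt(6.0 / 5.0)
    x_in = math.sqrt(3.0 / 7.0 - 2.0 / 7.0 * r)
    x_out = math.sqrt(3.0 / 7.0 + 2.0 / 7.0 * r)
    w_in = (18.0 + math.sqrt(30.0)) / 36.0
    w_out = (18.0 - math.sqrt(30.0)) / 36.0
    nodes = [(-x_out, w_out), (-x_in, w_in), (x_in, w_in), (x_out, w_out)]
    return [(0.5 * (x + 1.0), 0.5 * w) for x, w in nodes]

def product_rule_16():
    """Return 16 (x, y, weight) triples on the triangle (0,0), (1,0), (0,1).

    The map (u, v) -> (u, v*(1-u)) sends the unit square onto the triangle;
    its Jacobian (1-u) is folded into the weights.
    """
    rule = []
    for u, wu in gauss_legendre_4_unit():
        for v, wv in gauss_legendre_4_unit():
            rule.append((u, v * (1.0 - u), wu * wv * (1.0 - u)))
    return rule

def integrate_reference(f, rule):
    """Weighted sum approximating the integral of f over the reference triangle."""
    return sum(w * f(x, y) for x, y, w in rule)

if __name__ == "__main__":
    rule = product_rule_16()
    # The exact integral of x^3 y^3 over the reference triangle is 3! 3! / 8!.
    print(integrate_reference(lambda x, y: x ** 3 * y ** 3, rule), 36.0 / 40320.0)
```

The sampling points cluster toward the collapsed vertex, which is one reason product rules of this kind need more points than an optimal rule of the same degree.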

Based upon the foregoing discussion, we can develop a
strategy for efficient numerical integration. Because the
error exhibits such a strong functional dependence on the
ratio of observer distance to triangle side, we should keep
this ratio carefully under control. In the approximation of
the seamount by triangular facets, the sides of every triangle
are arranged to be shorter than the water depth over
the shallowest corner times a factor; in practice we have
ensured that the longest side of any triangle never exceeds
twice the water depth. This arrangement guarantees that
the ratio $D/s$ of Figure 4 is always greater than one
half. Using the equilateral triangle data as a basis for an
error model, we conclude that with the 16-point conical
product rule the relative error should remain below one
part in a hundred for the diagonal elements and better
than that for the off-diagonal ones. The coordinates and
weights for this formula are given in Table 2. To check
the actual accuracy of the integration, we have tested the
scheme in a few cases by applying $C_{16}$ and $C_{25}$ to the same
body; this test indicates that the error estimate is conservative
and that we always achieve accuracies of a few parts
in a thousand.
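
The facet-size criterion lends itself to a simple check. The sketch below tests whether the longest edge of a facet exceeds twice the water depth over its shallowest corner and, if so, bisects the longest edge recursively. Several choices here are assumptions rather than the authors' procedure: corners are (x, y, depth) triples with depth positive downward, edge lengths are measured in three dimensions, and the depth at a new midpoint is linearly interpolated instead of being read from bathymetry.

```python
import math

def edge_length(p, q):
    """Three-dimensional distance between two (x, y, depth) corners."""
    return math.dist(p, q)

def facet_ok(tri, factor=2.0):
    """True if the longest edge is at most factor times the shallowest corner depth."""
    longest = max(edge_length(tri[i], tri[(i + 1) % 3]) for i in range(3))
    shallowest = min(p[2] for p in tri)
    return longest <= factor * shallowest

def refine(tri, factor=2.0):
    """Bisect the longest edge until every resulting facet passes facet_ok."""
    if facet_ok(tri, factor):
        return [tri]
    i = max(range(3), key=lambda j: edge_length(tri[j], tri[(j + 1) % 3]))
    a, b, c = tri[i], tri[(i + 1) % 3], tri[(i + 2) % 3]
    mid = tuple(0.5 * (u + v) for u, v in zip(a, b))  # interpolated depth (assumption)
    return refine((a, mid, c), factor) + refine((mid, b, c), factor)

if __name__ == "__main__":
    facet = ((0.0, 0.0, 1000.0), (3000.0, 0.0, 1500.0), (0.0, 3000.0, 2000.0))
    pieces = refine(facet)
    print(len(pieces), all(facet_ok(t) for t in pieces))
```

Refining one facet in isolation leaves unmatched vertices on shared edges, so a production mesh would need to split the neighboring facets as well.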

TABLE 2. Sampling points and weights of the
sixteen-point, degree-seven, conical product
cubature rule on the right-angled triangle
with corners (0, 0), (0, 1), and (1, 0), constructed
as described by Stroud [1971]. The
corresponding values for any other triangle
may be found by a simple linear transformation.

We must understand how precise the approximation to
the elements of the various matrices needs to be. We
briefly consider the perturbation theory for the solution to
(20), which we abbreviate by

$$ B\xi = b $$

The right side represents data (magnetic anomaly values),
and these are known at best to about a few parts in $10^2$.
Let $\xi + \epsilon$ be the solution to a slightly perturbed data vector
$b + e$, and for the moment, take $B$ to be exact. Then


$$ B(\xi + \epsilon) = b + e $$

and it is well known [e.g., Golub and Van Loan, 1983] that
the perturbation in the solution is bounded as follows:

$$ \frac{\|\epsilon\|}{\|\xi\|} \le \kappa\, \frac{\|e\|}{\|b\|} \qquad (30) $$

where $\kappa$ is the condition number of $B$ given by

$$ \kappa(B) = \|B\|\, \|B^{-1}\| $$


Here we take the vector norm to be the ordinary
Euclidean length, and we take the Frobenius norm for the
matrices, the square root of the sum of elements squared.
The condition number, which is never less than unity,
governs the way in which perturbations in the data are
magnified in the solution vector. It is impossible to
predict the value of $\kappa$ from general principles although we
expect it to increase with $\mu$ in (20). If the answers we
obtain are to be useful, they must not be sensitive to
small errors in $b$, and therefore the condition numbers of
our matrices should be less than $10^2$, and we have checked
this in actual examples. Now let us examine how errors in
the matrix $B$ alter the solution; let the numerical approximation
to the true $B$ be $B + E$ and the correspondingly
corrupted solution vector be $\xi + \epsilon'$; obviously

$$ (B + E)(\xi + \epsilon') = b $$
and then there is a companion result of (30) that states

$$ \frac{\|\epsilon'\|}{\|\xi\|} \le \frac{\kappa(B)\, \|E\| / \|B\|}{1 - \kappa(B)\, \|E\| / \|B\|} $$

provided $\kappa(B)\|E\|/\|B\| < 1$. We show that with an
appropriately small relative error in the numerical integration,
the right side of the above inequality is small in comparison
with the right side of (30). If the relative error in
cubature is always less than some fraction, then each element
of $E$ is smaller in magnitude than that fraction of the
corresponding element of $B$, because in (20) the diagonal of
$\Gamma$ is positive. From the definition of the Frobenius norm,
$\|E\|/\|B\|$ is then bounded by the same fraction. It follows
that the cubature error introduces effects in the solution
small in comparison to those due to data uncertainty
if the relative precision is an order of magnitude smaller
than the relative error in the measurements. This is why
we have set the target for the level of accuracy in the
cubature rule at a few parts in a thousand.
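
A small numerical experiment illustrates the bound (30) with the Frobenius norm. The 2 x 2 system, the perturbation, and the helper names below are all hypothetical; the sketch only shows the inequality holding for one arbitrary example.

```python
import math

def frobenius(m):
    """Square root of the sum of squared matrix elements."""
    return math.sqrt(sum(x * x for row in m for x in row))

def inverse_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matvec(m, v):
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]

def norm(v):
    """Ordinary Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in v))

def perturbation_check(B, b, e):
    """Return (relative change in the solution, kappa times relative change in data)."""
    B_inv = inverse_2x2(B)
    kappa = frobenius(B) * frobenius(B_inv)
    x = matvec(B_inv, b)
    x_pert = matvec(B_inv, [bi + ei for bi, ei in zip(b, e)])
    eps = [xp - xi for xp, xi in zip(x_pert, x)]
    return norm(eps) / norm(x), kappa * norm(e) / norm(b)

if __name__ == "__main__":
    lhs, rhs = perturbation_check([[2.0, 1.0], [1.0, 1.01]], [1.0, 2.0], [0.01, -0.02])
    print("relative solution change:", lhs, "bound:", rhs)
```

For a nearly singular matrix such as this one the bound can exceed the actual change by a wide margin, which is consistent with treating the condition number as a worst-case measure.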

4. Theory II: Appraising the Solution

A complete analysis of any inverse problem includes an
assessment of the reliability of the solution. This is commonly
provided by an estimate of the "resolution" associated
with the given set of data. In our problem that kind
of study is inappropriate for two reasons. First, it requires
a theory capable of dealing with a vector-valued function
in three dimensions, making the display and computation
of the results extremely complicated. Second, even exact
values of the magnetic field measured at every exterior
point are incapable of yielding a unique magnetization
model, which implies that the resolution of any conceivable
set of data will be poor.

Recall that the volume of the seamount times the mean
magnetization is just the total dipole moment of the body,
and if magnetic anomaly data were available everywhere
(or just on a sphere enclosing the seamount), the dipole
moment could be deduced exactly from the field data.
Therefore some aspects of the model can be obtained
unambiguously in ideal circumstances, and we shall focus
on the average magnetization, $U$. The direction of the
associated vector allows the calculation of the pole position
of the magnetic field at the time of formation of the
seamount if we assume the seamount formed rapidly
enough that the motion of the tectonic plate may be
neglected, and if we accept the axial dipole hypothesis
[McElhinny, 1973, chap. 6]. The fluctuations of the secular
variation are expected to average to zero over a period
of formation of the seamount. The validity of the paleopole
calculation might hold even if the seamount captured
a polarity reversal, provided the opposite polarities are not
present with exactly equal magnetic moments. Thus,
instead of assessing the quality of our model at every interior
point with an analysis of resolution, we seek the
uncertainty in the single important property, the average
magnetization vector. Since this is a linear functional of
the unknown model, we take up an idea of Backus [1970]
and Parker [1977] on bounding collections of linear functionals
in Hilbert space; our approach differs from these,
however, because of the way in which we handle uncertainty
in the measurements.

Even though a complete knowledge of the external
magnetic field does uniquely determine $U$, practical magnetic
anomaly data cannot. This is shown as follows:
choose $U$ arbitrarily, and demand that a model magnetization
simultaneously possess this average and satisfy the
given data. The requirement of a particular average magnetization
may be written

$$ (X_n, M) = (X_n, U), \qquad n = 1, 2, 3 $$

Since the right side can be calculated, we have three additional
equations to be included with the $N$ in (5) provided
by observation. Because the enlarged set of representers,
including the three artificial ones $X_1$, $X_2$, and $X_3$, remains
linearly independent, the associated Gram matrix is nonsingular,
and therefore it is always possible to find a magnetization
$M$ exactly matching the observations for any
choice of $U$ whatever. The proof of linear independence
is an easy extension of the one given in appendix A. This
negative result is not as upsetting as it might at first
appear: when unreasonable $U$ are chosen, the models generated
may be unacceptable because fitting the data
together with an "unnatural" mean magnetization may
cause extremely large intensities and rapid variations of
the solution on a small scale. For example, any model
with rms magnetization of 100 A m$^{-1}$ could be rejected
even if it did explain the magnetic anomaly precisely: such
an intensity is considerably outside the range of observed
values for marine basalts established by extensive direct
sampling in the Deep Sea Drilling Project of the crust
[Bleil and Petersen, 1983] and the sparser sampling of
seamounts themselves [A'/iiih, 1977]. Since there is a generally
agreed upon upper limit of magnetization for a plausible
model, this factor may be included as part of the
information to be used in determining the reliability of the
mean magnetization.

The Hilbert space $P$ is apparently suited to this problem
because its norm is, after a scaling, just the rms magnetization.
An acceptable model is one obeying the two constraints:

$$ (M, M) \le V M_{max}^2 \qquad (31) $$

$$ \sum_{j=1}^{N} [d_j - (G_j, M)]^2 \le S^2 \qquad (32) $$

where $V$ is the volume of the seamount. (Strictly we
should write $m(V)$, that is, the measure of the set $V$.)
The average magnetization of an arbitrary $M \in P$ is given
by

$$ A[M] = V^{-1} \sum_{n=1}^{3} X_n (X_n, M) $$

This follows from the definition of the orthogonal uniform
elements $X_n$ and

$$ (X_m, X_n) = V \delta_{mn} $$

We ask how far in the sense of the norm the mean magnetization
can be from $U_0$, and still satisfy both the data
and a constraint that the rms intensity be bounded by
$M_{max}$. By solving this maximization problem we find
limits on the uncertainty in the mean magnetization: if we
find quite large values of $M_{max}$ yield a small range for
$A[M]$, the data may be said to determine the mean magnetization
well; if, however, the range in $A[M]$ is large
even when $M_{max}$ approaches values within observational
experience, say 10 A m$^{-1}$, it must be concluded that the
true average magnetization is not strongly constrained by
the data. As we shall see, for the seamounts we have
studied so far the second alternative seems to apply: the
rigorous analysis of uncertainty is disappointing and leads
to the unwelcome conclusion that a very wide range of
average magnetizations is compatible with the data and
the imposition of a prior upper limit on rms magnetization.
We have reasons for believing the true uncertainty
is much less than indicated by the bound given in the
theory; we will take up this matter again. But first, we
sketch the theory for obtaining a strict bound.

We write the mathematical problem as a minimization,
rather than a maximization: we seek the element of $P$
obeying (31) and (32) that minimizes $-\|A[M] - U_0\|^2$.

As in the previous problem containing inequalities, we
appeal to the John multiplier theorem, constructing the
functional

$$ F[M] = -\|A[M] - U_0\|^2 + \mu \sum_{j=1}^{N} [d_j - (G_j, M)]^2 + \lambda (M, M) \qquad (33) $$

where $\lambda, \mu \ge 0$ are two John multipliers associated with
complementary slackness conditions

$$ \mu \Big\{ \sum_{j=1}^{N} [d_j - (G_j, M)]^2 - S^2 \Big\} = 0 $$

$$ \lambda \{ (M, M) - V M_{max}^2 \} = 0 $$

Once more, we expect that neither $\mu$ nor $\lambda$ will vanish and
that (31) and (32) will be equalities in all cases of practical
importance. For this problem we use a variational
approach: the stationary points of (33) are located by taking
the Gateaux derivative of $F$ [see Smith, 1974, chap. 2]
and setting it to zero. The Gateaux derivative is

$$ \Delta F[M] = -\frac{2}{V} \sum_{n=1}^{3} [(X_n, M) - (X_n, U_0)]\, X_n + 2\lambda M + 2\mu \sum_{j=1}^{N} [(G_j, M) - d_j]\, G_j \qquad (34) $$

Suppose this vanishes when $M = M_1$. Then, because $\lambda$ is
positive, (34) can be rearranged thus

$$ M_1 = \sum_{j=1}^{N} a_j G_j + \sum_{n=1}^{3} b_n X_n \qquad (35) $$

where the coefficients $a_j$ and $b_n$ are various combinations
of inner products in the representers and uniform elements.
In other words, the stationary value of $F$ occurs at
an element of $P$ that is a linear combination of elements
$G_j$ and $X_n$, which puts $M_1$ in the same subspace of $P$ as
$U_0$. Equation (35) is substituted into (33), and the
minimization problem is reduced to one in a finite-dimensional
space, finding the appropriate coefficients $a_j$
and $b_n$. The equations are expressed in matrix form if all
the inner products are performed. We find

$$ F = -\big\| [A^T/V \;\vdots\; I]\, c - \beta \big\|^2 + \lambda\, c^T \begin{bmatrix} \Gamma & A \\ A^T & V I \end{bmatrix} c + \mu \big\| [\Gamma \;\vdots\; A]\, c - d \big\|^2 $$

$$ \quad = -\|B_0 c - \beta\|^2 + \lambda\, c^T B_1 c + \mu \|B_2 c - d\|^2 \qquad (36) $$

where the vector $c \in R^{N+3}$ is a composite of the
coefficients $a_j$ and $b_n$,

$$ c = (a_1, a_2, \ldots, a_N, b_1, b_2, b_3)^T $$

and $\beta \in R^3$ is the vector of coefficients defining $U_0$ in the
uniform basis. In (36) the norms are ordinary Euclidean
lengths of vectors and $I$ is the 3x3 unit matrix; the other
matrices and vectors are the same as those appearing in
section 2; the correspondence between the matrices $B_0$,
$B_1$, $B_2$, and the ones in (36) is straightforward.

Viewed as a quadratic form in $c$, the functional $F$ has
only one stationary point, which can be found by
differentiation: for any given John multipliers $\mu$ and $\lambda$, $F$
is stationary when

$$ c = (-B_0^T B_0 + \lambda B_1 + \mu B_2^T B_2)^{-1} (\mu B_2^T d - B_0^T \beta) \qquad (37) $$

The next problem is to find the appropriate multipliers so
that the data misfit and the maximum allowable norm are
reproduced, the classical problem of unknown multipliers.
We consider (37) to define the vector $c$ as a function of $\lambda$
and $\mu$, that is, $c = c(\lambda, \mu)$. In these terms the two conditions
to be obeyed are that $f_1(\lambda, \mu) = f_2(\lambda, \mu) = 0$,
where

$$ f_1(\lambda, \mu) = c(\lambda, \mu)^T B_1\, c(\lambda, \mu) - V M_{max}^2 \qquad (38) $$

$$ f_2(\lambda, \mu) = \|B_2\, c(\lambda, \mu) - d\|^2 - S^2 \qquad (39) $$

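In principle we can apply Newton's method to solve this
pair of equations once the derivatives are known; by the
chain rule applied to (38) and (39), with $B_1$ symmetric,

$$ \frac{\partial f_1}{\partial \lambda} = 2 c^T B_1 \frac{\partial c}{\partial \lambda}, \qquad \frac{\partial f_1}{\partial \mu} = 2 c^T B_1 \frac{\partial c}{\partial \mu} $$

$$ \frac{\partial f_2}{\partial \lambda} = 2 (B_2 c - d)^T B_2 \frac{\partial c}{\partial \lambda}, \qquad \frac{\partial f_2}{\partial \mu} = 2 (B_2 c - d)^T B_2 \frac{\partial c}{\partial \mu} $$

and from (37)

$$ \frac{\partial c}{\partial \lambda} = -(-B_0^T B_0 + \lambda B_1 + \mu B_2^T B_2)^{-1} B_1 c $$

$$ \frac{\partial c}{\partial \mu} = (-B_0^T B_0 + \lambda B_1 + \mu B_2^T B_2)^{-1} B_2^T (d - B_2 c) $$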
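
The derivative formulas that follow from (37) can be checked against finite differences on a toy problem. In the sketch below the 2 x 2 matrices standing in for $B_0$, $B_1$, $B_2$, the vectors `d` and `beta`, and all function names are hypothetical stand-ins chosen for illustration; the real matrices have order $N + 3$ and would be handled with a linear algebra library.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def matvec(m, v):
    return [sum(x * y for x, y in zip(row, v)) for row in m]

def solve_2x2(m, rhs):
    (a, b), (c, d) = m
    det = a * d - b * c
    return [(d * rhs[0] - b * rhs[1]) / det, (a * rhs[1] - c * rhs[0]) / det]

def system_matrix(B0, B1, B2, lam, mu):
    """-B0^T B0 + lam*B1 + mu*B2^T B2, the matrix inverted in (37)."""
    b0tb0 = matmul(transpose(B0), B0)
    b2tb2 = matmul(transpose(B2), B2)
    return [[-b0tb0[i][j] + lam * B1[i][j] + mu * b2tb2[i][j] for j in range(2)]
            for i in range(2)]

def solve_c(B0, B1, B2, d, beta, lam, mu):
    """Stationary coefficient vector c(lam, mu) from (37)."""
    rhs = [mu * x - y for x, y in
           zip(matvec(transpose(B2), d), matvec(transpose(B0), beta))]
    return solve_2x2(system_matrix(B0, B1, B2, lam, mu), rhs)

def dc_dlambda(B0, B1, B2, d, beta, lam, mu):
    c = solve_c(B0, B1, B2, d, beta, lam, mu)
    H = system_matrix(B0, B1, B2, lam, mu)
    return [-x for x in solve_2x2(H, matvec(B1, c))]

def dc_dmu(B0, B1, B2, d, beta, lam, mu):
    c = solve_c(B0, B1, B2, d, beta, lam, mu)
    H = system_matrix(B0, B1, B2, lam, mu)
    resid = [di - x for di, x in zip(d, matvec(B2, c))]
    return solve_2x2(H, matvec(transpose(B2), resid))

if __name__ == "__main__":
    args = ([[0.5, 0.1], [0.0, 0.4]], [[2.0, 0.3], [0.3, 1.0]],
            [[1.0, 2.0], [0.0, 1.0]], [1.0, 0.5], [0.2, -0.1])
    print(dc_dlambda(*args, 1.5, 0.7), dc_dmu(*args, 1.5, 0.7))
```

With these derivatives of $c$ in hand, the Jacobian of $(f_1, f_2)$ needed for a Newton step follows from the chain rule.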

Here the functional under minimization is concave, not
convex, and so uniqueness of the solution cannot be
guaranteed. In particular, there may be more than one
$(\lambda, \mu)$ pair giving rise to the root $f_1(\lambda, \mu) =
f_2(\lambda, \mu) = 0$, that is, the desired values of norm and
misfit. If there are several solutions, we must choose the
one that gives the largest value of $\|B_0 c - \beta\|^2$. In the
absence of any theoretical results on this matter, the only
safe procedure is to explore the positive quarter plane
$\lambda, \mu > 0$, evaluating a measure of misfit to the root such
as $f_1^2 + f_2^2$. Where this indicator function is small, the
iterative process can be invoked to produce a precise root.

If (37) were used together with (38) and (39), the search
process would be very expensive computationally, since
for every $(\lambda, \mu)$ point we must solve (37), a linear system
of the order of $N_1 = N + 3$, where $N$ is the number of
magnetic anomaly observations, typically of the order of
150. After advantage is taken of the symmetry of the
matrix, each solution requires of the order of $N_1^3/6$ operations,
or "flops" [Golub and Van Loan, 1983, p. 90]. We
can entirely avoid these costly solutions at the expense of
two initial spectral factorizations; the details are set out in
appendix C. In most cases, the rearrangement described
in appendix C can achieve reductions in computational
cost of a factor of 200 or more.
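
The operation count quoted above is easy to put into numbers. The sketch below evaluates $N_1^3/6$ for $N = 150$ and multiplies by the number of points in a search grid; the 50 x 50 grid and the function names are hypothetical and serve only to show the scale of the cost of a brute-force search.

```python
def solve_cost_flops(n_obs):
    """Approximate cost of one symmetric solve of order n_obs + 3, as N1^3 / 6."""
    n1 = n_obs + 3
    return n1 ** 3 / 6.0

def grid_search_cost(n_obs, n_lambda, n_mu):
    """Total cost of solving the system at every point of a (lambda, mu) grid."""
    return n_lambda * n_mu * solve_cost_flops(n_obs)

if __name__ == "__main__":
    print("one solve:", solve_cost_flops(150), "flops")
    print("50 x 50 grid:", grid_search_cost(150, 50, 50), "flops")
```

A saving of a factor of 200 in the per-point cost therefore matters most when the grid is fine or the search must be repeated for several seamounts.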

Having  found  the  vector  c  that  minimizes  the  func¬ 
tional  F  of  (36)  we  compute  the  corresponding  error  cone 
in  the  mean  magnetization  vector  as  follows.  The  vector 
u  6  R'  given  by 


u  =  Bac  =  \A  '  Vl\c 

is the set of expansion coefficients of $A[M]$, the average
magnetization of the solution $M$, in the basis of uniform
elements $X_k$. We may treat $\alpha$ exactly as the uniform
part of the magnetization of the seamount; the vector $\beta$
plays the same role for $U_0$. Thus the angle between $\alpha$ and
$\beta$, namely, $\cos^{-1}(\alpha^T\beta / \|\alpha\|\,\|\beta\|)$, is an upper limit on the
angular uncertainty in our determination of the direction
of average magnetization, which can easily be converted
into the error of the paleomagnetic pole position. With
considerably more work one could obtain the cone of
uncertainty by maximizing the angle in all the planes
containing the vector $\beta$; the figure would be entirely
contained within the cone we have just found and would
certainly not have a circular section. The complexity of the
maximization problem and the rather poor performance of
the present approach with field data deter us from
pursuing the question further.
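
The angular bound described above is simply the angle between two 3-vectors of expansion coefficients. A minimal sketch of that computation follows; the function name is hypothetical and the input vectors in the usage note are made-up numbers, not results from the paper.

```python
import math


def angle_between_deg(alpha, beta):
    """Angle in degrees between two 3-vectors: arccos(a.b / (|a| |b|))."""
    dot = sum(a * b for a, b in zip(alpha, beta))
    norm_a = math.sqrt(sum(a * a for a in alpha))
    norm_b = math.sqrt(sum(b * b for b in beta))
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cosine = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.degrees(math.acos(cosine))
```

For example, `angle_between_deg((1.0, 0.0, 0.0), (1.0, 1.0, 0.0))` gives 45 degrees to rounding error; the half-angle of the error cone would be obtained in the same way from the two coefficient vectors.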


5.  Application  to  Field  Data 

We  now  apply  the  theory  to  a  seamount  survey  per¬ 
formed  in  the  South  Pacific.  P.  F.  Lonsdale  collected  the 
data  on  the  R/V  Thomas  Washington  of  the  Scripps  Insti¬ 
tution  of  Oceanography  during  leg  6  of  the  Marathon 
Expedition  in  September  of  1984.  The  seamount  chosen 
for  study,  at  48.2°S,  148.8°W,  designated  LR148.8W  on 
account  of  its  longitude,  is  the  youngest  of  a  long  series  to 
be  found  on  the  Louisville  Ridge.  Total  field  magnetic 
values  were  measured,  and  precise  bathymetric  informa¬ 
tion  (contoured  in  Figure  5)  was  available  through  a  Sea 
Beam  system.  Global  Positioning  System  Navigation  was 
available  for  a  part  of  the  survey  and  for  the  remainder, 
navigation  relied  upon  dead  reckoning  based  upon  the 
Doppler  log. 

Morphological and petrological evidence [Hawkins et al.,
1985] supports the idea that the Louisville Ridge is a hot
spot chain similar to the Hawaii-Emperor chain in the
North Pacific but on a somewhat smaller scale; it appears
to have been active for a comparable period of time.
LR148.8W is a large edifice at the southern and therefore
younger end of the chain. It is believed to be quite
recent, less than 10 m.y. in age. We expect that the virtual
geomagnetic pole (VGP) of the uniform part of the
magnetization  should  lie  close  to  the  present-day  rotation 
axis  of  the  earth,  assuming  the  effects  of  secular  variation 
have  been  averaged  out  during  the  time  of  construction. 
From  Figure  6  we  see  that  the  magnetic  anomaly  is  com¬ 
plex  and  does  not  resemble  that  of  a  uniformly  magne¬ 
tized  body  (Figure  7),  perhaps  suggesting  the  presence  of 
both  normal  and  reversed  magnetization,  although  normal 
material  clearly  predominates. 

The original records were prepared for inversion as follows.
The Sea Beam bathymetric data were sampled with
a spacing approximately proportional to the local water
depth in order to ensure accurate numerical cubature in
the computation of the Gram matrix elements as discussed
in section 3. A total of 295 samples went into this
description. The coordinates in the horizontal plane were
organized into a tessellation best approximating equilateral
triangles by Watson's [1982] algorithm, yielding 565 triangles
(Figure 8). When the appropriate vertical coordinate
is assigned to each vertex, the result is a model for the
upper surface of the seamount in terms of triangular
facets; the lower boundary of the model is the horizontal
plane bounded by the 4250-m contour. We showed in
section 3 that replacement of the true bottom surface by a
plane and of the upper surface by triangular facets introduces
quite negligible error into the calculations. The
model seamount is shown in Figure 9 as it would be seen
by an observer in the south looking in a direction 20° west
of north and downward 10° below horizontal; in this figure
there is a factor of 5 vertical exaggeration. Notice the
almost flat top and the nearly pseudosphere-like appearance
of the flanks of the volcano. Magnetic anomaly values were
computed from the original intensity measurements by
subtracting a standard main field model, International
Geomagnetic Reference Field 80 [Peddie and Fabiano,
1982]. At the latitude of the survey, diurnal variations are
of sufficiently small amplitude that they may safely be
ignored. A subset of 131 anomaly values as shown in Figure 5
was chosen to be the data for inversion purposes.
Notice that no interpolation or estimation from contour
maps was required; only magnetic values from the original
survey were used along with corrected positions from the
Sea Beam map. Also analyzed were magnetic readings
taken during the ship's passage between seamounts of the
chain. Seafloor spreading anomalies were prominent, but
LR148.8W does not straddle a reversal boundary; it lies
on reversely magnetized crust between anomalies 26 and
24 [Lonsdale, 1986]. The magnetic field fluctuations in the
vicinity were found to have an rms amplitude of 32 nT;
the power spectrum had the expected $\exp(-4\pi k z_0)$ form
for a field of crustal origin, but the amplitude is much
higher than can be attributed to bottom roughness (see
section 3) and a reasonable uniform magnetization, and
we infer that the observed field is due primarily to local
variations in the intensity of the reversely magnetized
crust. We chose 30 nT as the level of misfit for model
predictions; notice that the peak magnetic anomaly of
LR148.8W is over 1200 nT.

Fig. 5. Bathymetry of LR148.8W contoured at 250-m intervals
with 1000-m levels plotted as heavy solid lines. The solid circles
give the locations of the magnetic field intensity observations used
in the inversion procedure. The interior dashed box is the boundary
of Figures 6, 7, and 12. This box measures approximately 44
km by 34 km.

Fig. 6. Magnetic anomaly associated with LR148.8W, contoured
at an interval of 150 nT. Negative contours are dashed lines, and
the zero level is shown as a heavy solid line. The maximum
anomaly contour is 1200 nT. Notice that the scale of this map is
slightly larger than that of Figure 5.

Fig. 7. Best approximation to the observed anomaly of Figure 6
by means of a uniform internal magnetization, contoured at an
interval of 150 nT. Negative contours are dashed lines, and the
zero level is shown as a heavy solid line. The maximum contour
is only 750 nT. Three additional parameters describing a linearly
varying background field have been fitted, to allow for small
errors in the main field model.

Fig. 8. Plan view of the triangular facet model of LR148.8W used
in the numerical calculations. There are 565 triangular facets in
the upper surface of the model; the lower boundary is a plane
located on the 4250-m contour.

The calculations of the Gram matrix $\Gamma$ and the Green
matrix $A$ were carried out as described using the Cl(, and
C.v cubature rules. The construction of the Gram matrix
is by far the most time-consuming part of all the computations;
on the Cray X-MP/48 at the National Science Foundation
San Diego Supercomputer Center, the C:< calculation
took 10 min, which would translate into several days
on a lesser machine. To check the accuracy of the numerical
cubature, many of the subsequent calculations were
performed with the matrices found by the two rules; no
significant differences were observed, from which we conclude
that the C|„ rule was satisfactory.

Fig. 9. View of the model seamount as seen by an observer to
the south looking 20° west of north and downward 10°. The vertical
scale has been exaggerated by a factor of 5.

The  first  model  calculation  to  be  performed  was  the 
traditional  least  squares  fitting  of  a  uniform  model  to  the 
anomaly  (with  the  additional  free  parameters  for  slope  and 
offset  of  the  ambient  field);  in  our  notation  this  is  just  the 
least  squares  problem 

$$A\beta \approx d$$

which we solved in the usual way by QR factorization
[Golub and Van Loan, 1983, chap. 6]. As we have already
remarked, the field predicted by a uniform magnetization
(Figure 7) is a very poor approximation to the measured
one. The rms misfit between the observed anomaly and
the model field is 269 nT, nearly 10 times the crustal
background noise of 30 nT. Therefore little confidence is
to be placed in the associated VGP location of 57.8°N,
118.9°W, which is far from the north geographic pole.
The intensity of magnetization of the uniform model is
3.7 A m⁻¹, a perfectly satisfactory figure.
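
The least squares step can be illustrated with a small self-contained sketch. It solves an overdetermined system by a thin QR factorization and reports the rms misfit. As an assumption made for brevity, the factorization here uses modified Gram-Schmidt rather than the Householder-based routines a production code would normally call, and the four-row test matrix in the usage note is invented; the function names are hypothetical.

```python
import math


def qr_least_squares(A, d):
    """Least squares solution of A p ~ d via thin QR (modified Gram-Schmidt)."""
    m, n = len(A), len(A[0])
    Q = [[float(A[i][j]) for i in range(m)] for j in range(n)]  # columns of A
    R = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for k in range(j):
            # Remove the component of column j along the finished column k.
            R[k][j] = sum(Q[k][i] * Q[j][i] for i in range(m))
            Q[j] = [Q[j][i] - R[k][j] * Q[k][i] for i in range(m)]
        R[j][j] = math.sqrt(sum(v * v for v in Q[j]))
        Q[j] = [v / R[j][j] for v in Q[j]]
    qtd = [sum(Q[j][i] * d[i] for i in range(m)) for j in range(n)]
    p = [0.0] * n
    for j in reversed(range(n)):                 # back substitution on R p = Q^T d
        p[j] = (qtd[j] - sum(R[j][k] * p[k] for k in range(j + 1, n))) / R[j][j]
    return p


def rms_misfit(A, p, d):
    """Root-mean-square difference between predictions A p and data d."""
    res = [sum(a * x for a, x in zip(row, p)) - di for row, di in zip(A, d)]
    return math.sqrt(sum(r * r for r in res) / len(res))
```

Calling `qr_least_squares([[1, 0], [1, 1], [1, 2], [1, 3]], [2, 5, 8, 11])` recovers the coefficients 2 and 3 of the straight line that generated the data, with an rms misfit at rounding level. In the seamount problem the columns of the matrix would instead hold the fields of the three uniform magnetization components and the three background-field parameters.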

We turn next to the most nearly uniform solution
obtained by seminorm minimization. The calculations
involve the solution of the linear system (equation (20))
in section 2; recall that the multiplier $\mu$ is unknown and
must be chosen to yield a model with the desired misfit.
Rather than single-mindedly exhibiting one solution with
the desired 30-nT misfit, we swept through a large range
of multipliers, calculating the VGP position and misfit for
each. This is not a computationally expensive procedure,
and it is instructive. We saw in section 2 that very small
values of $\mu$ cause the solution to approach the least
squares solution with its large misfit of 269 nT, while large
values of the multiplier lead to misfits approaching zero;
this behavior is apparent in Figure 10a. At the preferred
misfit of 30 nT the VGP location for the uniform component
of the solution, $U_0$, is 83.0°N, 47.2°W, remarkably
close to the geographic north pole. The position of the
VGP is quite insensitive to misfit over a considerable
range: it moves less than 4° as the misfit varies from 5 nT
to 100 nT. This is illustrated in Figure 10b, where the
trajectory of the pole is plotted on a map of the polar region.
Such stability is highly desirable because our estimate of
the uncertainty in the magnetic measurements is not precise.
The most nearly uniform model corresponding to
the 30-nT misfit has an overall rms intensity of magnetization
of 6.22 A m⁻¹. The nonuniform component of the
solution accounts for only 0.78 A m⁻¹ in the sense of the
norm. (Recall that the uniform part $U$ and the nonuniform
portion $R$ of the model are orthogonal elements of
P.) Thus only about 13% nonuniform magnetization is
needed to reduce the misfit from the 269 nT of the
best-fitting purely uniform model to 30 nT. Although the
nonuniform part of the model is small in its contribution
to the overall magnetization, its presence is a decisive factor
in obtaining good agreement with the data. In Figure
11 we attempt to display the internal magnetization vector
of our preferred solution: the seamount is cut in three
horizontal planes, and in each section we draw arrows,
whose sizes and orientations represent the magnetization
distribution. We see that the bulk of the model is normally
magnetized and resembles $U_0$ with its dip of 65.2°
and declination of −4.8°; there is no reversed material in
these sections. It is the nature of our solution that the
magnetization is more irregular at those points nearest the
observations, that is, the points on the upper surface of
the seamount, and this is evident in Figure 11. Therefore
near the peak there probably will be small pockets of
reversed magnetization because of the greater fluctuation
of the solution on the boundary. Little credence should
be given to such features even though a seamount formed
during the Cenozoic might be expected to contain normal
and reversed material. It is not possible to prove
rigorously from the external field data that reversed
material is present; this follows from the fact that any
finite magnetic data set can be satisfied by a body normally
magnetized in its entirety, but we shall not go into the
proof here.

Fig. 10. Behavior of the solutions as the parameter $\mu$ varies. (a)
Rms misfit of computed magnetic anomaly to observed values as
a function of the multiplier $\mu$. A misfit of 30 nT is considered
appropriate. (b) Pole path of the different solutions shown in a
Lambert equal-area projection of the polar region. Open circles
are poles with rms misfits of 200, 150, and 100 nT from left to
right. The star is the pole of the 30-nT misfit solution.
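
The sweep over multipliers described in this section, and the selection of the value giving a target misfit, can be sketched on a deliberately simplified problem. The assumptions are stated loudly: the uniform-magnetization columns and the ambient field parameters are dropped, so the system reduces to (Gamma + I/mu) alpha = d, and the Gram matrix is taken to be already diagonalized, so that only its eigenvalues and the rotated data are needed. In this reduced problem the small-multiplier limit is the rms of the data rather than the least squares misfit of the full problem. All names and numbers are illustrative.

```python
import math


def rms_misfit(mu, eigvals, d_rot):
    """Rms residual of (Gamma + I/mu) alpha = d, written in the eigenbasis of Gamma.

    Component i of the residual d - Gamma alpha equals d_i / (1 + mu * gamma_i).
    """
    residuals = [d / (1.0 + mu * g) for g, d in zip(eigvals, d_rot)]
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))


def multiplier_for_misfit(target, eigvals, d_rot, lo=1e-12, hi=1e12, n_iter=200):
    """Bisect on log(mu); the misfit decreases monotonically as mu grows."""
    for _ in range(n_iter):
        mid = math.sqrt(lo * hi)          # geometric midpoint of the bracket
        if rms_misfit(mid, eigvals, d_rot) > target:
            lo = mid                      # misfit too large: need a larger mu
        else:
            hi = mid
    return math.sqrt(lo * hi)


if __name__ == "__main__":
    eigvals = [4.0, 1.0, 0.25]            # hypothetical Gram eigenvalues
    d_rot = [300.0, -200.0, 100.0]        # hypothetical rotated data, nT
    mu = multiplier_for_misfit(30.0, eigvals, d_rot)
    print(mu, rms_misfit(mu, eigvals, d_rot))
```

Because each misfit evaluation costs only a handful of operations once the factorization is in hand, sweeping a wide range of multipliers is inexpensive, which is consistent with the remark in the text.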

Fig. 11. Magnetization vectors in three horizontal sections of the
seamount at bathymetry contour levels 1000 m, 2000 m, and
3000 m. The solution illustrated is the most uniform model with
rms misfit to the data of 30 nT. The length of the arrow depicts
the strength of magnetization, and the orientation can be inferred
from the aspect of the conical head.

How accurate is the paleopole of the mean magnetization?
In the previous section we described a theory that
was capable in principle of providing the answer if we are
prepared to supply an upper limit on the rms magnetization
to be found in marine basalts. The principle of the
method is to construct the worst possible case: the model
that lies as far as possible from the preferred solution
while lying within the constraints set by the data and
an upper limit on the norm of $M$, or equivalently, the rms
magnetization. We could embark upon the search over
the a, m plane seeking stationary values of h in (33), but
in the present example this is unnecessary because we can
find a model obeying the constraints with a completely
different VGP position; in other words, we need not seek
the worst case. In section 2 we discussed the problem of
finding an element in P nearest to a fixed element; if
the fixed element is 0, the zero element, we will find the
smallest model of P fitting the data. The magnetization of
this smallest model is, from (10), a linear combination of
the representers $G_j$. When this theory is adjusted to
include misfit and the three parameters for the ambient
field, it becomes an example of seminorm minimization
on P', and the necessary matrix elements are already
available. Arranging $\mu$ to give the expected 30 nT misfit,
we find an rms magnetization averaged over the seamount
of only 1.70 A m⁻¹, and the pole position of the projection
of $M$ onto the subspace of uniform elements is 60.1°N,
34.4°E. Obviously this model has a small enough norm
and a good fit, but its pole lies 30° from that of the most
uniform model; there is no doubt that by allowing a larger
norm, solutions with poles even further away must exist.
Is it really true that the uncertainty in the uniform VGP
is so large? The fact that the direction is very close to
what one might expect on geological grounds encourages
us to believe otherwise; it must be remembered that we
have approached the problem of uncertainty by estimating
an upper bound and that the true error could be far
smaller than that bound. An explanation for the generosity
of the error bound in the theory of this paper may be
traced to a property of the norm of P: although the rms
magnetization of the minimum norm model is only
1.70 A m⁻¹, we calculate that at some points within the
model the local magnetization intensity rises to nearly
1500 A m⁻¹, a totally unacceptable value. Thus a reasonable
rms magnetization is by itself no guarantee of a plausible
magnetization model. Perhaps a more suitable restriction
would be that the magnetization at any point in $V$
should not exceed a prescribed limit, say 20 A m⁻¹. This
is a kind of uniform norm, but uncertainty estimates
based upon it would be much more complicated than
those undertaken in this paper; the idea is probably worth
pursuing, but we do not take it up here.
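
The contrast between an rms bound and a pointwise (uniform norm) bound can be made concrete with a toy example. The cell values below are invented purely for illustration and assume equal-volume cells: a single cell magnetized at 1500 A m⁻¹ among a million cells at 0.8 A m⁻¹ leaves the rms close to 1.70 A m⁻¹, so the rms alone cannot flag the implausible spike.

```python
import math


def rms_intensity(values):
    """Root-mean-square intensity over equal-volume cells."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def peak_intensity(values):
    """Largest pointwise intensity (the uniform, or sup, norm)."""
    return max(abs(v) for v in values)


def example_cells(n_cells=10**6, background=0.8, spike=1500.0):
    """Hypothetical model: one extreme cell in an otherwise weak background."""
    return [background] * (n_cells - 1) + [spike]


if __name__ == "__main__":
    cells = example_cells()
    print(f"rms  = {rms_intensity(cells):.2f} A/m")
    print(f"peak = {peak_intensity(cells):.0f} A/m")
```

A constraint on `peak_intensity` would reject this model immediately, whereas a constraint on `rms_intensity` of, say, a few A m⁻¹ would accept it.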

Far from the seamount its magnetic field closely resembles
that of a point dipole, the influence of higher multipole
terms having fallen away more rapidly than that of
the leading term. An analysis of three suitable field
measurements can yield the vector moment of a dipole. But
the dipole moment divided by the volume is just the mean
magnetization of the seamount. Another way of looking
at this question was given by Parker [1971]: we imagine
trying to construct the linear functional for $X_1$, say, from a
linear combination of the representers $G_j$. The
representers for distant observers are much smoother and
more nearly constant, so that they are much more valuable
to the approximation. This discussion ignores the
problem of noise in the observations: while the field
values far from the seamount have a more direct connection
with the average magnetization, their relative accuracy
is much less than those nearer the body because the noise
fields do not drop off with distance from the seamount
while the signal does. There must be an optimal distance
from the seamount at which the field values yield a maximum
amount of information about $U_0$. Did the survey
of LR148.8W extend to that optimum distance? We have
reason to believe that it did not. A certain amount of
manipulation of (20) yields the following informative
expression for $\beta$, the vector of coefficients for $U_0$ in the
basis $X_k$:

$$\beta = \left[A^T(\mu^{-1}I + \Gamma)^{-1}A\right]^{-1} A^T(\mu^{-1}I + \Gamma)^{-1} d = (\mu_0/4\pi)\, W^T d$$

where $W$ is an $N \times 6$ matrix (remember the inclusion of
the three additional free parameters describing the background
field); the factor $\mu_0/4\pi$ renders $W$ dimensionless.
This equation shows how every magnetic field measurement
contributes to each coefficient of the uniform component
of the solution. Column 1 of $W$ can be interpreted
as a weight vector in an equation of the form

$$\beta_1 = (\mu_0/4\pi)\, w_1^T d$$

When $w_1$ is mapped we see which regions of the survey
affect our determination of $\beta_1$ by the size of the weight
function. The average magnetization is insensitive to the
magnetic anomaly in places where $w$ is small, and conversely,
it depends heavily on data where $w$ is large.
(This interpretation is not strictly valid because if a
different data value appeared at any point, the size of $\mu$
would have to be adjusted to retain the desired misfit,
thereby changing $W$; if the putative changes are small, so
will be the perturbation to $W$.) As a synopsis of the influence
on all of $U_0$ we have contoured the combined dimensionless
weight function in Figure 12. We discover that our knowledge
of $U_0$ is determined for the most part by values at the edge
of the survey. If 36 data are removed in a 15-km disk at the center
of the magnetic anomaly, thus obliterating the central
maximum and the negative patch to the south of it, the
uniform part of the new solution has a pole position only
6° away from the one obtained with all the data. This
confirms the relative lack of importance of the central
region for the purposes of obtaining an estimate of the
uniform component. The way in which the weight function
grows toward the edge of the map strongly suggests
that the optimum distance for recovering information
about the average magnetization probably lies beyond the
boundary of the present survey, and it must be concluded
that to obtain the best estimate of the average magnetization,
the area should be extended considerably. We anticipate
that with a more nearly optimal distribution of magnetic
observations the error bound will be significantly
smaller.

Fig. 12. Dimensionless weight function expressing the influence
of each field measurement upon the estimation of the average
magnetization. The small levels in the central region indicate that
magnetic measurements there have little effect on $U_0$.
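
The weight interpretation can be explored numerically. The sketch below forms the rows of the operator that maps the data vector to the uniform coefficients, under the assumption that the governing system has the bordered form (Gamma + I/mu) alpha + A beta = d with A^T alpha = 0; equation (20) itself is not reproduced in this section, so this form is an assumption, and the constant factor mu_0/4pi is omitted. The tiny Gram and design matrices in the usage note are invented, and the helper names are hypothetical.

```python
def solve(M, B):
    """Solve M X = B (B given as rows) by Gauss-Jordan elimination with partial pivoting."""
    n, k = len(M), len(B[0])
    aug = [list(map(float, M[i])) + list(map(float, B[i])) for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(n):
            if r != c:
                f = aug[r][c] / aug[c][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return [[aug[i][n + j] / aug[i][i] for j in range(k)] for i in range(n)]


def matmul(X, Y):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def transpose(X):
    return [list(col) for col in zip(*X)]


def weight_rows(gram, A, mu):
    """Rows of (A^T S^-1 A)^-1 A^T S^-1 with S = Gamma + I/mu, so beta = weight_rows . d."""
    n = len(gram)
    S = [[gram[i][j] + (1.0 / mu if i == j else 0.0) for j in range(n)] for i in range(n)]
    s_inv_a = solve(S, A)                       # N x K
    normal = matmul(transpose(A), s_inv_a)      # K x K
    return solve(normal, transpose(s_inv_a))    # K x N, using the symmetry of S
```

Each row of the result shows how strongly every datum influences one coefficient; mapping the magnitude of those entries at the observation sites gives a picture analogous to Figure 12. A useful sanity check is that the rows applied to the columns of `A` return the identity, so data generated exactly by a uniform model are reproduced whatever the multiplier.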

LR148.8W is a challenging seamount for analysis: the
magnetic anomaly is complex and obviously incompatible
with the assumption of a uniform interior. The fact that
we are able to recover a reasonable VGP and that the
minimum departures from perfect uniformity are very
mild gives us encouragement as we contemplate a wider
application of the technique. Our seamount is an unusually
large body, nearly 1600 km³ in volume, and it comes
within 500 m of the ocean surface. These two properties
are major factors in determining the computing time
needed for our approach, particularly when it is recalled
from section 3 that the size of each triangular facet is
governed by the local water depth. The majority of
seamounts would make much more modest computational
demands.

6. Discussion

We have presented a method using linear inverse theory
to construct an internal magnetization function for
seamounts based upon their magnetic anomalies and
shapes. We have shown why it is fundamentally impossible
to deduce the true magnetization from external field
data no matter how precise or complete it is. Therefore
our model has been selected to correspond as closely as
possible to the simplest structure, the uniformly magnetized
body. Approximations to this ideal are expected on
geological grounds if the seamount formed rapidly or during
a period of single polarity of the main geomagnetic
field. Nonetheless, extensive modeling of actual
seamounts has shown that the uniform model by itself
rarely gives an accurate account of the magnetic anomaly
so that significant nonuniformity is certainly present.
Application of the method to a young seamount in the
Louisville Ridge chain yields a magnetization accounting
accurately for the magnetic anomaly; we find that in the
measure of the norm, only 13% nonuniformity is required
to obtain the good agreement, even though the observed
anomaly is complex and poorly approximated by the field
of a uniform body. This result suggests that the picture of
an essentially uniform seamount may not be so inaccurate
after all, but since even quite small amounts of heterogeneity
have a disproportionately large influence on the
form of the magnetic anomaly, this has been impossible to
appreciate until now. We predict that relatively small
proportions of nonuniformity will be needed in all the
seamounts with simple magnetic anomalies.
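
As a check on the 13% figure, the arithmetic can be written out from the values quoted in section 5, under the assumption that the overall intensity of 6.22 A m⁻¹ and the nonuniform contribution of 0.78 A m⁻¹ are expressed in the same norm and that the uniform and nonuniform parts are orthogonal:

```latex
\|M\|^2 = \|U\|^2 + \|R\|^2
\quad\Longrightarrow\quad
\|U\| = \sqrt{6.22^2 - 0.78^2} \approx 6.17\ \mathrm{A\,m^{-1}},
\qquad
\frac{\|R\|}{\|M\|} = \frac{0.78}{6.22} \approx 0.125 .
```

The ratio rounds to the quoted value of about 13%, while the uniform part alone carries almost the whole of the norm.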

The mean magnetization is a property of a seamount
that can in principle be obtained from the magnetic anomaly
alone. The direction of the mean magnetization vector
is diagnostic of the paleomagnetic field averaged over the
period of formation of the body, and it is the most valuable
information about the seamount for tectonic studies.
In our example we find that the mean magnetization vector
predicts a paleopole very close to the north geographic
pole in agreement with our expectations; this is in contrast
to the pole position of the best-fitting uniform model,
which lies 30° away. We have developed a theory for the
uncertainty in the estimate of the mean magnetization
requiring an upper limit on the overall rms magnetization
allowed in the volcanic rocks; sampling and knowledge of
rock magnetism put such estimates on a secure footing.
Unfortunately, the results for the Louisville Ridge
seamount are disappointing: the uncertainty is so large as
to give the impression that the calculated paleopole position
is unreliable. The excellent location of the model
pole leads us to believe that the true error is much smaller
and that a better theory is needed for its estimation. We
show that significant improvements in the uncertainty
estimates can be expected to follow from a more extensive
survey, since it is apparent from our calculations that data
taken at a surprisingly great distance from the magnetic
sources contain substantially more information about the
mean magnetization than those closer to them. We would
not recommend dispensing with coverage over the central
region in future surveys, however, because the shape of
the seamount must still be known in detail if its magnetic
field is to be properly analyzed. It is to be hoped of course
that progress on the theoretical front will obviate the
necessity of resurveying every seamount. In fact we may
anticipate here a refinement in the theory of error
estimation: recent work [Parker, 1987] relying upon a plausible
statistical characterization of the magnetic nonuniformities
promises to provide the basis for an alternative and more
powerful theory for the uncertainties.

Appendix  A:  Linear  Independence 
of  the  Representers 

The linear independence of the representers in (2)
guarantees the positive definiteness of the Gram matrix,
an essential property for a number of results in the paper.
For any particular seamount and set of observer positions
one could in principle verify that $\Gamma$ had a positive
determinant, but it is more satisfactory if this fact can be
shown in general as we shall now proceed to do. We follow
the usual path of investigating the consequences of
assuming that the functions are linearly dependent.
Linear dependence would imply the existence of a set of
constants $\gamma_j$, not all zero, such that

$$0 = \sum_{j=1}^{N} \gamma_j\, \mathbf{G}(\mathbf{r}_j, \mathbf{s})$$

for every point $\mathbf{s} \in V$. From (2) we see this is equivalent
to

$$0 = \hat{\mathbf{B}}_0 \cdot \nabla\nabla F(\mathbf{s}) \qquad \text{(A1)}$$

where

$$F(\mathbf{s}) = \sum_{j=1}^{N} \frac{\gamma_j}{|\mathbf{r}_j - \mathbf{s}|}. \qquad \text{(A2)}$$

For this appendix only, let us erect a local Cartesian axis
system with the $z$ direction aligned with $\hat{\mathbf{B}}_0$. Then
$\hat{\mathbf{B}}_0 \cdot \nabla = \partial/\partial z$. It is elementary that the only solutions to
(A1) are of the form

$$F(x, y, z) = c_1 z + c_2 + f(x, y)$$

where $c_1$ and $c_2$ are constants and $f$ is an arbitrary
continuously differentiable function of $x$ and $y$. Thus linear
dependence implies the existence of coefficients $\gamma_j$ such
that

$$c_1 z + c_2 + f(x, y) = \sum_{j=1}^{N} \frac{\gamma_j}{|\mathbf{r}_j - \mathbf{s}|} \qquad \text{(A3)}$$

for $(x, y, z) = \mathbf{s} \in V$. The functions $1/|\mathbf{r}_j - \mathbf{s}|$ are all
analytic in $\mathbf{s}$ inside $V$, so that if those coefficients exist,
(A1) remains valid by analytic continuation of the individual
components of the vector everywhere outside $V$,
except right at the singularities $\mathbf{s} = \mathbf{r}_j$. Therefore the
property (A3) may be extended outside $V$ in the same way.

Consider a sphere centered at the origin of coordinates
and enclosing all the observation positions; its radius $R_1$ is
greater than $\max|\mathbf{r}_j|$. We evaluate $F$ using (A2) at a position
vector $\mathbf{s}$ outside this sphere such that $|\mathbf{s}| > 2R_1$; it
follows that

$$|F| \le \frac{N\gamma_{\max}}{|\mathbf{s}| - R_1} < \frac{2N\gamma_{\max}}{|\mathbf{s}|}$$

where $\gamma_{\max}$ is the largest of all the magnitudes of the $\gamma_j$.
Thus by choosing $|\mathbf{s}|$ large enough we can make $|F|$ as
small as we please. If $\mathbf{s} = (x, y, z)$ and $z$ increases
without bound while $x$ and $y$ are fixed, $F$ must tend to
zero. Thus $c_1$ must be zero, or otherwise the left side of
(A3) would grow in magnitude in the limit of large $z$. But
if $c_1$ vanishes, the left side of (A3) no longer depends on
$z$. Therefore $c_2 + f(x, y)$ must vanish identically for
every $x$ and $y$, if $F$ is to tend to zero when $z$ tends to
infinity. In other words, the left side of (A3) is zero.

Thus equation (A3) has become

$$0 = \sum_{j=1}^{N} \frac{\gamma_j}{|\mathbf{r}_j - \mathbf{s}|}. \qquad \text{(A4)}$$

From this we show that all the $\gamma_j$ coefficients vanish too.
Renumber the coordinates and coefficients so that $\gamma_1$ is
the coefficient of largest magnitude; also consider a position
$\mathbf{s}$ so that $|\mathbf{r}_1 - \mathbf{s}| = \epsilon$. Then from (A4)

$$\frac{\gamma_1}{\epsilon} = -\sum_{j=2}^{N} \frac{\gamma_j}{|\mathbf{r}_j - \mathbf{s}|}.$$

Now let $R_2$ be the smallest of all the interobserver distances
$|\mathbf{r}_j - \mathbf{r}_k|$; if $\epsilon$ is chosen to be smaller than $R_2/2$ it
is easily verified that

$$|\mathbf{r}_j - \mathbf{s}| > R_2/2, \qquad j = 2, 3, \ldots, N,$$

and hence

$$\frac{|\gamma_1|}{\epsilon} \le \sum_{j=2}^{N} \frac{2|\gamma_j|}{R_2} \qquad \text{(A5)}$$

so that $|\gamma_1| \le \epsilon C$,
where $C$ is some constant independent of $\epsilon$. Since we
may choose $\epsilon$ to be as small as we please, this means that
the $\gamma_j$ largest in magnitude must vanish; so then they all
must. This contradicts the original assertion that not all of
them could be zero, and therefore we must conclude that
the representers are not linearly dependent.

Notice that the proof fails, as it should, if two of the
observer positions are in fact identical; then $R_2$ would
vanish, and (A5) would not be legitimate.
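
The conclusion of this appendix can be illustrated numerically, although an illustration is of course no substitute for the proof. As a simplifying assumption, the sketch below works with the scalar functions 1/|r_j − s| of (A2) rather than the vector representers of (2), samples them at a handful of invented source points, and tests the resulting Gram matrix for positive definiteness with a Cholesky factorization. The positions, tolerance, and function names are all hypothetical.

```python
import math


def kernel_samples(observer, sources):
    """Samples of 1/|r - s| for one observer position r over the source points s."""
    return [1.0 / math.dist(observer, s) for s in sources]


def gram_matrix(observers, sources):
    """Discrete inner products between the sampled kernels of every observer pair."""
    cols = [kernel_samples(r, sources) for r in observers]
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in cols] for ci in cols]


def is_positive_definite(G, rel_tol=1e-10):
    """Attempt a Cholesky factorization; fail if any pivot is negligibly small."""
    n = len(G)
    scale = max(G[i][i] for i in range(n))
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = G[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= rel_tol * scale:
                    return False
                L[i][i] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return True


# Hypothetical source points below the observers (z < 0).
SOURCES = [(x, y, z) for x in (-1.0, 0.0, 1.0) for y in (-1.0, 0.0, 1.0) for z in (-1.0, -2.0)]
DISTINCT = [(-2.0, 0.0, 0.5), (0.0, 0.0, 0.5), (2.0, 0.0, 0.5)]
REPEATED = [(-2.0, 0.0, 0.5), (0.0, 0.0, 0.5), (-2.0, 0.0, 0.5)]
```

With distinct observer positions the factorization succeeds, whereas repeating a position makes two kernels identical and the last pivot collapses to rounding level, mirroring the remark that the argument breaks down when the smallest interobserver distance vanishes.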

Appendix B: Approximation of Terrain
by Triangular Facets

When the power spectral density of the bathymetry is
known, it is possible to estimate the probable error
committed by replacing the true surface by a plane triangle
that interpolates the bathymetric values at its corners.
Although to our knowledge no spectral studies exist for
the surface of an actual seamount, it seems plausible to
assume that on scales much less than the diameter of the
seamount, the surface roughness is approximately the
same as that of very young oceanic seafloor or of terrestrial
lava flows, and for both of these, quantitative analysis
is available. We begin with data presented by Fox and
Hayes [1985] on the spectrum of bathymetric profiles; one
of their spectra describes the Gorda Rise, and we shall use
the parameters estimated for this region.

We find it most convenient to develop the theory using the autocorrelation function of the topography. The (two-dimensional) autocorrelation function, $R(\vec{y})$, is defined by the expectation of the product of two samples of the topography taken at positions $\vec{x}$ and $\vec{x} + \vec{y}$ (we use $\vec{x}$ rather than $\mathbf{x}$ to stress the fact that the vectors give positions in the plane), and this is related to the power spectral density through the Fourier transform:

$$R(\vec{y}) = E[h(\vec{x})\,h(\vec{x} + \vec{y})] \tag{B1}$$

$$= \int_{\mathbb{R}^2} S(\vec{k}) \exp(2\pi i\,\vec{k}\cdot\vec{y})\; d^2\vec{k}$$

$$= \iint S(k_x, k_y) \exp(2\pi i[k_x x + k_y y])\; dk_x\, dk_y \tag{B2}$$

where $S(\vec{k}) = S(k_x, k_y)$ is the wave number power spectral density of the topography. In practice it is found that the spectra are fairly isotropic, so that $S$ depends only on $|\vec{k}|$ and not on the direction of $\vec{k}$; it follows that $R$ is a function independent of the direction of its argument vector. We denote the isotropic spectral density by $S[|\vec{k}|]$ and the isotropic autocorrelation function by $R[|\vec{y}|]$, admitting the slight risk of confusion from this notation. Fox and Hayes do not provide power spectral densities $S(\vec{k})$ or isotropic spectral functions $S[|\vec{k}|]$, but give instead the one-dimensional spectral amplitudes of profiles, functions we denote by $A_1(k_x)$. Now

$$A_1(k_x)^2 = \int_{-\infty}^{\infty} S(\vec{k})\; dk_y \tag{B3}$$

which is just the two-dimensional power spectrum integrated in a wave number direction normal to that of the profile [see Shure and Parker, 1981]; since all directions are equivalent, any constant direction may be chosen. If we evaluate (B2) for $y = 0$ we see from (B3) that

$$R(x, 0) = \int_{-\infty}^{\infty} A_1(k_x)^2 \exp(2\pi i k_x x)\; dk_x = R[|x|] \tag{B4}$$

Thus the isotropic autocorrelation function is just the one-dimensional inverse Fourier transform of the profile power spectral density.

Fox and Hayes find that for many kinds of terrain a power law holds quite well; we rewrite their result:

$$A_1(k_x)^2 = c_1\,|l_0 k_x|^{-\eta} \tag{B5}$$

where $l_0$ is an arbitrary constant with dimensions of length. After a careful interpretation of the unusual units used in the paper, we calculate that for the young volcanic terrain of the Gorda Rise, $c_1 = 4800\ \mathrm{m}^3$ and $\eta = 2.48$ when $l_0$ is set to 1 km. Equation (B5) cannot be substituted into (B4) in a classical way because the Fourier integral diverges owing to the singularity at $k_x = 0$; in fact this shows us that the extrapolation of the spectrum to all wave numbers is not strictly legal since it possesses infinite energy caused by contributions at long wavelengths. However, it is only the short wavelengths that concern us here and so we use distribution theory to evaluate the Fourier transform [Gel'fand and Shilov, 1964, p. 359]:

$$R[r] = 2(2\pi)^{\eta-1}\, c_1\, l_0^{-1} \sin(\pi\eta/2)\,\Gamma(1-\eta)\,(r/l_0)^{\eta-1} = c_2\,(r/l_0)^{\eta-1} \tag{B6}$$

In this appendix, $\Gamma$ is the familiar special function, not the Gram matrix. We find $c_2 = -239\ \mathrm{m}^2$. It is at first astonishing that $R[0] = 0$ and that it is negative elsewhere; the explanation is that the generalized Fourier transform has suppressed an infinite constant term. The true autocorrelation function has a large but unknown value at zero, but the result we shall obtain (equation (B8)) is invariant under addition of any constant to $R$ so that our ignorance of the constant is unimportant.
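As a quick numerical check of the prefactor in (B6), the following minimal sketch evaluates $c_2$ from the Gorda Rise parameters quoted above. The function name is illustrative, and the code assumes $c_1$ is expressed in m³ and $l_0$ in metres; the result should land close to the quoted value of about -239 m².

```python
import math


def autocorrelation_prefactor(c1, eta, l0):
    """Prefactor c2 of the power-law autocorrelation R[r] = c2 (r/l0)**(eta - 1).

    c1  : profile spectrum constant (assumed here to be in m^3)
    eta : spectral exponent of the profile power law
    l0  : reference length (assumed here to be in metres)
    """
    return (2.0 * (2.0 * math.pi) ** (eta - 1.0) * (c1 / l0)
            * math.sin(math.pi * eta / 2.0) * math.gamma(1.0 - eta))


# Gorda Rise values quoted in the text, with l0 = 1 km expressed in metres.
c2 = autocorrelation_prefactor(c1=4800.0, eta=2.48, l0=1000.0)
print(f"c2 = {c2:.1f} m^2")
```

The negative sign comes from $\sin(\pi\eta/2)$, which is negative for $2 < \eta < 4$, while $\Gamma(1-\eta)$ is positive for $2 < \eta < 3$.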

From (B6) we may calculate the isotropic spectrum $S[k]$ by means of a Hankel transform, which is just the inverse transform of (B2) expressed in polar form [Bracewell, 1978]:

$$S[k] = \int_0^{\infty} J_0(2\pi k r)\,R[r]\;2\pi r\,dr = c_3\,(k l_0)^{-(\eta+1)} \tag{B7}$$

where

$$c_3 = c_2\, l_0^2\, \frac{\Gamma((1+\eta)/2)}{\pi^{\eta}\;\Gamma((1-\eta)/2)}$$

For the constants we are using, $c_3 = 27{,}400\ \mathrm{m}^4$. This result will be useful in calculating the magnetic effect of the roughness in the terrain.

Consider now a triangular region $T$ defined by its corners at position vectors $\vec{x}_1$, $\vec{x}_2$, and $\vec{x}_3$, which lie in a horizontal plane. The plane-interpolated topography for $\vec{x} \in T$ is found from the values at the corners via a formula of the kind

$$\hat{h}(\vec{x}) = \sum_{j=1}^{3} h(\vec{x}_j)\,\phi_j(\vec{x})$$

where the basis functions $\phi_j$ are each 1 at $\vec{x}_j$ and zero along the opposite edge of the triangle, varying linearly over $T$. We can find the variance of the interpolation via the expectation, assuming as usual that $h$ has zero mean. Define

$$\delta^2 = E[(\hat{h}(\vec{x}) - h(\vec{x}))^2]$$

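Then, using the definition of the autocorrelation function (B1) and performing a certain amount of algebra, we obtain

$$\delta^2 = R[0]\Big(1 + \sum_{j} \phi_j(\vec{x})^2\Big) + \sum_{j<k} 2R[|\vec{x}_j - \vec{x}_k|]\,\phi_j(\vec{x})\,\phi_k(\vec{x}) - \sum_{j} 2R[|\vec{x} - \vec{x}_j|]\,\phi_j(\vec{x})$$

Next we average over the triangle $T$:

$$\langle\delta^2\rangle_T = \int_T \delta^2\; d^2\vec{x}\,/A$$

where $A$ is the area of the triangle. Then, performing the integrals involving the basis functions we find

$$\langle\delta^2\rangle_T = \tfrac{3}{2}R[0] + \tfrac{1}{6}\sum_{j<k} R[|\vec{x}_j - \vec{x}_k|] - \frac{1}{A}\sum_{j}\int_T 2R[|\vec{x} - \vec{x}_j|]\,\phi_j(\vec{x})\; d^2\vec{x} \tag{B8}$$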

This is a general result, not depending on any particular choice of autocorrelation function, except that isotropy has been assumed. In substituting (B6) into (B8) it is convenient to define a cyclic extension of the corner vectors so that $\vec{x}_{j+3} = \vec{x}_j$ for $j = 1, 2$ and similarly for the angles at the corners associated with each $\vec{x}_j$, which will be called $\theta_j$. We wish to separate the effects of size from those of triangle shape, and so we introduce normalized side lengths, referred to the area of the triangle: $s_j = |\vec{x}_{j+1} - \vec{x}_{j+2}|/A^{1/2}$. With all this we find after some effort that (B8) becomes

$$\langle\delta^2\rangle_T = -c_2\,(A^{1/2}/l_0)^{\eta-1}\,\Theta(T) \tag{B9}$$

where $\Theta(T)$ is a shape factor for the triangle given by
2r  ‘I'P  (Hj .  :)-V  (-«, .. , )  I  (s,  V 


©m  =  i 


(r)+  1 )  (rj+2)  (s,  )’r 


and here the function $\Psi$ is the integral

$$\Psi(\theta) = \int_{\theta}^{\pi/2} \frac{d\alpha}{\sin^{\eta+1}\alpha} = \cos\theta\; F(1+\eta/2,\ 1/2;\ 3/2;\ \cos^2\theta) \tag{B10}$$

where $F$ is Gauss' hypergeometric function; power series expansions given by Abramowitz and Stegun [1965, chap. 15] are quite convenient for evaluating $\Psi$. The function $\Theta(T)$ is plotted in Figure 2 for the value of $\eta = 2.48$ given for the Gorda Rise by Fox and Hayes [1985]. The smallest value is attained by equilateral triangles with $\Theta(T) = 0.3198$ and only extremely scalene triangles achieve values above unity.

The Fox and Hayes study gives spectral estimates up to $k_x = 5\ \mathrm{km}^{-1}$ or a wavelength of 200 m; it would be helpful in confirming our analysis if we could show that the same power law extended to smaller scales. We have compared their spectrum with one found for a subaerial lava flow in Bonito, Arizona [Jaeger and Schuring, 1966]. The largest wavelength estimated in this spectrum is 10 m, so that there is, unfortunately, no overlap. We find that the marine spectrum, extrapolated to the shorter scale, is consistently a factor of 10 smaller in power. It is likely this discrepancy results from the fact that the Bonito flow consists mostly of aa, which is almost certainly much rougher than the expected surface of a seamount on the small scales.

In this appendix we describe a practical means for solution of (38) and (39). Certain roots of these equations give rise to magnetizations of prescribed norm and data misfit that are as far as possible from the uniform state. It is possible that other stationary points exist to the optimization problem that do not correspond to the desired maximum, so that a fairly complete exploration of the $\lambda$, $\mu$ plane should be performed to uncover all the roots. For this to be a practical proposition, an economical method must be found for solving (37), the large system of linear equations associated with each point in the plane.

The fundamental idea is the possibility of finding a single similarity transform simultaneously mapping two symmetric matrices into diagonal form [Golub and Van Loan, 1983, chap. 8]. First we solve the eigenvalue problem for the matrix $B_1$, and write the solution as the spectral factorization

$$B_1 = Q_1 \Lambda Q_1^T$$

where $Q_1$ is an $N_1 \times N_1$ orthogonal matrix, whose columns are the eigenvectors of $B_1$, and $\Lambda$ is the diagonal matrix of eigenvalues. Notice that all the eigenvalues should be positive because $B_1$ is positive definite; in practice, small errors in the cubature may cause this to be untrue for the actual numerical array and so it is necessary to regularize the problem by adding a small positive constant to the diagonal of $B_1$ or by taking absolute values for the eigenvalues. When this is done, the square root of $\Lambda$ is defined as well as its inverse; we denote this in the obvious way. Next we compute a singular value decomposition [Golub and Van Loan, 1983]: let

$$B_2 Q_1 \Lambda^{-1/2} = Z_2 \Sigma Q_2^T$$

where the matrix on the left has been computed from its known factors and those on the right are the singular value decomposition factors. $Z_2$ and $Q_2$ are orthogonal arrays of the appropriate sizes, and $\Sigma$ is an $N \times N_1$ matrix,

$$\Sigma = [\,\Gamma \ \ O\,]$$

where $\Gamma$ is the $N \times N$ diagonal array of singular values, and $O$ is an $N \times 3$ array of zeros. Notice that these two computationally expensive factorizations need be done only once per seamount. Now define the $3 \times N_1$ matrix $\tilde{B}_0$ by

$$\tilde{B}_0 = B_0 Q_1 \Lambda^{-1/2} Q_2$$

When  these  factorizations  are  substituted  into  (37)  we 
obtain 


‘  =  Q\ A  Q:  c 

where 

c  -  (-B's„  +  kl  +  nVZ)  Q]  A  Q(  (yiBld  -  Bkii) 

=  (-B'B„  +  kl  +  m£'I>  1 

<M<?<\  QlBld  +  Ql\  Q[Blfi) 

=  (-s'fl„+  kl  +  mI'I)  V'l  -  >'.')  (CII 

Observe that the vectors $\mathbf{v}_1$ and $\mathbf{v}_2$ do not depend on $\lambda$ or $\mu$. Equation (C1) is a transformed version of (37). The functions $f_1$ and $f_2$ are simply expressed in terms of the vector $\mathbf{c}$, which becomes the working variable during the heavy calculations:

./  i  -  Hcll;  -  -VfLv 
/;=  III?  -  S; 

In (C1) we still must solve an $N_1 \times N_1$ linear system for each $\lambda$, $\mu$ pair; the key to the efficient realization of this process is the fact that $\tilde{B}_0$ is not a square array but only $3 \times N_1$, and that $\lambda I + \mu\Sigma^T\Sigma$ is diagonal. We denote this diagonal matrix by $D(\lambda, \mu)$ and apply the following matrix identity, called the Sherman-Morrison-Woodbury formula [Golub and Van Loan, 1983, p. 3]:

$$(A + UW^T)^{-1} = A^{-1} - A^{-1}U(I + W^T A^{-1} U)^{-1} W^T A^{-1}$$

where $A$ is any square invertible matrix, $U$ and $W$ are of the proper sizes, but not necessarily square, and the left side must exist. If we choose $A = D$ and $U = -W = \tilde{B}_0^T$, then

$$(-\tilde{B}_0^T\tilde{B}_0 + \lambda I + \mu\Sigma^T\Sigma)^{-1} = D^{-1} + D^{-1}\tilde{B}_0^T\,(I - \tilde{B}_0 D^{-1}\tilde{B}_0^T)^{-1}\,\tilde{B}_0 D^{-1} \tag{C2}$$

The reason that this rearrangement represents a massive computational saving is that the matrix inversions in the expression are either of the diagonal matrix $D(\lambda, \mu)$, and hence trivial, or of the matrix $(I - \tilde{B}_0 D^{-1}\tilde{B}_0^T)$, which is only a $3 \times 3$ symmetric array. When $N_1$ is large, the number of computer operations to evaluate $\mathbf{c}$ using (C2) and (C1) is approximately $13N_1$; with $N_1 = 150$, this represents a factor of nearly 300 improvement over the equivalent calculation using (37).
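
The rearrangement in (C2) can be illustrated with a minimal NumPy sketch. It is not the authors' code: the function name `solve_woodbury`, the random test matrices, and the size $N_1 = 150$ are assumptions chosen only to show that a system with a diagonal-plus-rank-3 matrix can be solved using one 3 x 3 solve.

```python
import numpy as np


def solve_woodbury(d_diag, b0, rhs):
    """Solve (D - B0^T B0) c = rhs, with D = diag(d_diag) and B0 of shape (3, n).

    Only the diagonal D and a 3 x 3 matrix are ever inverted, following the
    Sherman-Morrison-Woodbury rearrangement.
    """
    dinv_rhs = rhs / d_diag                        # D^{-1} rhs
    dinv_b0t = b0.T / d_diag[:, None]              # D^{-1} B0^T, shape (n, 3)
    small = np.eye(b0.shape[0]) - b0 @ dinv_b0t    # I - B0 D^{-1} B0^T, 3 x 3
    return dinv_rhs + dinv_b0t @ np.linalg.solve(small, b0 @ dinv_rhs)


# Hypothetical test problem; sizes and values are illustrative only.
rng = np.random.default_rng(0)
n1 = 150
d_diag = 5.0 + rng.random(n1)                # positive diagonal entries
b0 = 0.1 * rng.standard_normal((3, n1))
rhs = rng.standard_normal(n1)

c_fast = solve_woodbury(d_diag, b0, rhs)
c_direct = np.linalg.solve(np.diag(d_diag) - b0.T @ b0, rhs)
print(np.max(np.abs(c_fast - c_direct)))
```

The direct solve costs on the order of $N_1^3$ operations, whereas the rearranged form needs only a few passes over vectors of length $N_1$, which is the saving described above.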

We present a multitaper algorithm to estimate the polarization of particle motion as a function of frequency from three-component seismic data. This algorithm is based on a singular value decomposition of a matrix of eigenspectra at a given frequency. The right complex eigenvector $\hat{\mathbf{z}}$ corresponding to the largest singular value of the matrix has the same direction as the dominant polarization of seismic motion at that frequency. The elements of the polarization vector $\hat{\mathbf{z}}$ specify the relative amplitudes and phases of motion measured along the recorded components within a chosen frequency band. The width of this frequency band is determined by the time-bandwidth product of the prolate spheroidal tapers used in the analysis. We manipulate the components of $\hat{\mathbf{z}}$ to determine the apparent azimuth and angle of incidence of seismic motion as a function of frequency. The orthogonality of the eigentapers allows one to calculate easily uncertainties in the estimated azimuth and angle of incidence. We apply this algorithm to data from the Anza Seismic Telemetered Array in the frequency band $0 \le f \le 30$ Hz. The polarization is not always a smooth function of frequency and can exhibit sharp jumps, suggesting the existence of scattered modes within the crustal waveguide and/or receiver site resonances.


1.  Introduction 

The polarization of particle motion as measured by a three-component seismometer has been studied by a number of straightforward methods, most simply by tracing the projection of the motion as a function of time onto a chosen plane of reference. Although useful to illustrate the particle motion of simple arrivals, this practice is qualitative and less useful with complicated signals.

The problem of extracting a particular type of wave (e.g., P, SH, Rayleigh) from a noisy background has been studied by correlation techniques and special filters [e.g., Kanasewich, 1981; Archambeau and Flinn, 1965; Vidale, 1986]. Most of these techniques are designed for time domain analysis and implicitly assume that the waveform has essentially the same polarization over all or most frequencies. Samson [1977, 1983a, b, c] describes a method of estimating the polarization as a function of frequency. This is important for the analysis of seismic records. The seismic waveforms of local and regional distance events are often superpositions of direct, refracted, reflected, and scattered waves, with no guarantee that the polarization or phase is constant in frequency. In the presence of strong scattering, one might not expect a respectable "pure state" polarization at any frequency. Alternatively, coherent addition of scattered waves within the crustal waveguide will produce traveling modes whose signature in extended body wave codas may be a well-defined polarization and phase that varies with frequency. The distinct spectral peaks seen by Park et al. [this issue] in seismic spectra observed on the Anza Seismic Telemetered Array [Berger et al., 1984] suggest that waveguide modes may be evident in the complex waveforms of events at epicentral distances of 100-250 km. Inhomogeneities in the crustal waveguide can lead to scattering and coupling of these propagating modes (see, e.g., Kennett [1986] and Odom [1986] for a description of these effects) which will, in general, cause frequency dependent scattering. In such cases, it is more useful to determine the type of seismic motion from its polarization signature, as in the study of Vidale [1986], than to attempt to isolate phases.

In this paper we develop and demonstrate another algorithm for determining the frequency dependence of the polarization of high-frequency seismic records. We have used multitaper spectral analysis [Thomson, 1982] to estimate the spectral density matrix $\mathbf{S}(f)$ of Samson [1983a]. This has several advantages. By employing prolate spheroidal wave functions as tapers (instead of cosine or boxcar tapers) to obtain direct spectral estimates, the elements of the estimated spectral density matrix will be less biased [Lindberg, 1986; Park et al., 1987]. It is also not necessary to apply a moving average to the density matrix estimate to smooth it; smoothing is obtained by summing the eigenspectra of each component of motion (see equation (3)). Using multitapers to estimate the spectral density matrix is more suitable for very short records, such as those which include a single seismic phase. This is because data are not discarded by applying a single bell-shaped taper to the record. (A similar method has been independently developed and applied to magnetometer data by Lanzerotti et al. [1986].)

We analyze a number of three-component records of seismic codas. In these observations the source pulse has been dispersed and scattered within the crust. In an idealized picture the shape of the source spectrum is retained in the shape of the coda spectrum, but the spectral phase is randomized by scattering effects. Despite this randomized phase, one might expect the particle motion to retain the polarization behavior of the type of wave motion dominant within a selected frequency band. Polarization analysis in the frequency domain offers an opportunity to characterize the signal better. With three-component data we have potentially three independent polarizations. If scattering is not great, a single polarization will predominate. This assumption is often true for the P wave coda. If, for instance, interaction with crustal structure decouples SH and SV motion, there may be two principal polarizations in the S wave coda. The algorithm we describe in this paper offers a quantitative criterion for identifying the single dominant polarization.

In section 2, our multitaper polarization analysis method is described. We apply the algorithm to a synthetic pulse example in section 3. In section 4 we show examples from the P wave codas of data observed on the Anza Seismic Telemetered Network. Section 5 summarizes our findings. Uncertainty estimates for polarization angles and phases are derived in the appendix.


2. Polarization Analysis With
the Multitaper Algorithm

Polarization analysis involves determining the eigenstructure of the spectral density matrix $\mathbf{S}(f)$. Suppose one has three-component data recorded in the time domain of the form

$$\mathbf{x}(t) = (x_1(t),\ x_2(t),\ x_3(t)), \qquad t = n\tau;\ \ n = 0, 1, \ldots, N-1$$

where $\tau$ is the sampling interval, $N\tau$ is the length of the time series, the coordinate system is right-handed, and $x_1(t)$ is the vertical component. If the $j$th record $x_j(t)$ has the frequency domain representation $y_j(f)$, the spectral density matrix $\mathbf{S}(f)$ has components

$$S_{ij}(f) = E[(y_i(f))^{*}\, y_j(f)]$$

where $E$ denotes the expectation operator. Samson [1983a] forms an estimate of the spectral density matrix, $\hat{\mathbf{S}}(f)$, with components

$$\hat{S}_{ij}(f) = (y_i(f))^{*}\, y_j(f), \qquad i, j = 1, 2, 3$$

where

$$y_j(f) = \frac{1}{N}\sum_{n=0}^{N-1} w_n\, x_j(n\tau)\, e^{i2\pi f n\tau} \tag{1}$$

is a discrete Fourier transform of the $j$th component of $\mathbf{x}(t)$ and $\{w_n\}_{n=0}^{N-1}$ is a chosen data taper. The matrix $\hat{\mathbf{S}}(f)$ is then smoothed in the frequency domain by applying a moving average, and the eigenvectors and eigenvalues of the smoothed matrix are found.
To apply the multitaper algorithm to the estimation of $\mathbf{S}(f)$, one employs a set of $K$ prolate spheroidal wave function "eigentapers" $\{w_n^{(k)}(N, W):\ k = 0, 1, \ldots, K-1\}$, which are optimally resistant to spectral leakage from outside a chosen frequency band of width $2W$ [Thomson, 1982; Lindberg, 1986; Park et al., 1987]. For $k = 0, 1, \ldots, K-1$ the spectral estimates

$$y_j^{(k)}(f) = \frac{1}{N}\sum_{n=0}^{N-1} w_n^{(k)}\, x_j(n\tau)\, e^{i2\pi f n\tau} \tag{2}$$

of each component of $\mathbf{x}(t)$ can be made. Then a multitaper estimate of the spectral density matrix is

$$\hat{\mathbf{S}}(f) = \frac{1}{K}\,\mathbf{M}^H(f)\cdot\mathbf{M}(f) \tag{3}$$

where superscript $H$ denotes conjugate transpose and

$$\mathbf{M}(f) = \begin{pmatrix} y_1^{(0)}(f) & y_2^{(0)}(f) & y_3^{(0)}(f) \\ y_1^{(1)}(f) & y_2^{(1)}(f) & y_3^{(1)}(f) \\ \vdots & \vdots & \vdots \\ y_1^{(K-1)}(f) & y_2^{(K-1)}(f) & y_3^{(K-1)}(f) \end{pmatrix}$$

The value of $K$, the number of eigenspectra used, depends on $2W$, the width of the frequency band in which the spectral energy at frequency $f$ is concentrated. The $K = 2NW\tau - 1$ lowest order eigentapers possess sufficient spectral leakage resistance to be useful [Slepian, 1983].
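
The construction of the eigenspectra matrix and of the estimate in (3) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the tapers are obtained from a dense eigendecomposition of the sinc kernel (adequate only for short records; a library routine such as `scipy.signal.windows.dpss` would normally be preferred), the transform sign and the $1/N$ scaling are assumed conventions, and all function names and the test signal are hypothetical.

```python
import numpy as np


def slepian_tapers(n_samples, time_bandwidth, n_tapers):
    """Discrete prolate spheroidal tapers from the sinc-kernel eigenproblem.

    time_bandwidth is the product N * W * tau. Returns (tapers, concentrations),
    with tapers of shape (n_tapers, n_samples) ordered by decreasing concentration.
    """
    half_bw = time_bandwidth / n_samples     # W in cycles per sample
    lag = np.arange(n_samples)[:, None] - np.arange(n_samples)[None, :]
    # Kernel sin(2 pi W (n - m)) / (pi (n - m)), equal to 2W on the diagonal.
    kernel = 2.0 * half_bw * np.sinc(2.0 * half_bw * lag)
    eigvals, eigvecs = np.linalg.eigh(kernel)
    order = np.argsort(eigvals)[::-1][:n_tapers]
    return eigvecs[:, order].T, eigvals[order]


def eigenspectra_matrix(x, tapers, tau, freq):
    """K x 3 matrix whose (k, j) entry is the k-th eigenspectrum of component j."""
    n = np.arange(x.shape[1])
    phase = np.exp(2j * np.pi * freq * n * tau)    # assumed sign convention
    return np.einsum("kn,jn,n->kj", tapers, x, phase) / x.shape[1]


def spectral_density_estimate(m_matrix):
    """Multitaper estimate M^H M / K of the 3 x 3 spectral density matrix."""
    return m_matrix.conj().T @ m_matrix / m_matrix.shape[0]


# Hypothetical test signal: a 20 Hz cosine on components 1 and 2 only.
tau = 0.004
n_samples = 256
t = np.arange(n_samples) * tau
x = np.vstack([np.cos(2 * np.pi * 20 * t),
               0.5 * np.cos(2 * np.pi * 20 * t),
               np.zeros(n_samples)])
tapers, concentrations = slepian_tapers(n_samples, 4.0, 7)   # K = 2*4 - 1
m = eigenspectra_matrix(x, tapers, tau, 20.0)
s_hat = spectral_density_estimate(m)
print(np.round(s_hat.real, 6))
```

No moving average is applied; the averaging over the $K$ eigenspectra in `spectral_density_estimate` plays the smoothing role described in the text.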

To investigate the eigenstructure of $\hat{\mathbf{S}}(f)$, we perform a singular value decomposition $\mathbf{M}(f) = \mathbf{L}\cdot\mathbf{D}\cdot\mathbf{V}^H$, where $\mathbf{L}$ is a $K \times K$ unitary matrix of left eigenvectors of $\mathbf{M}$, $\mathbf{V}$ is a $3 \times 3$ unitary matrix of right eigenvectors $\mathbf{v}_j$ of $\mathbf{M}$, and $\mathbf{D}$ is a $K \times 3$ matrix with $D_{jj} = d_j$, $j = 1, 2, 3$, the singular values of $\mathbf{M}$, and $D_{ij} = 0$ for $i \ne j$ [Golub and Van Loan, 1983]. The polarization vector $\hat{\mathbf{z}}$ is the right eigenvector corresponding to the largest singular value of the matrix $\mathbf{M}$. It specifies the direction of particle motion at frequency $f$ which contains the largest fraction of seismic energy [Samson, 1983b]. The components of $\hat{\mathbf{z}}$ can be complex, allowing for phase lags between components. Phase lags between components represent elliptical particle motion. Our ability to identify $\hat{\mathbf{z}}$ with the principal polarization of motion at $f$ can be qualitatively assessed by comparing the singular values $d_1 \ge d_2 \ge d_3$. If $d_1 \gg d_2, d_3$, the polarization $\hat{\mathbf{z}} = \mathbf{v}_1$ is well determined. We can use the ratio of the singular values to estimate the uncertainty in $\hat{\mathbf{z}}$ and any quantities we calculate from it. The estimation of the polarization uncertainty follows the derivation of Park and Chave [1984] and is outlined in the appendix. If $d_1 \approx d_2 \gg d_3$, there is a strong possibility that coherent seismic motion at $f_0$ exists at two separate polarizations. The dot product $\mathbf{v}_1\cdot\mathbf{v}_2 = 0$ by virtue of the singular value decomposition, but this orthogonal relationship need not carry over into the seismic polarizations. In an S wave arrival, one expects SV and SH motion to be orthogonal to first order in most situations, but the superposition of other signals (e.g., reflected P arrivals) need not have orthogonal polarizations.
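
The extraction of the polarization vector from the singular value decomposition can be sketched as below. This is an illustrative example rather than the authors' code: the helper name `polarization_vector` and the synthetic rank-one matrix are assumptions. Note that `numpy.linalg.svd` returns $\mathbf{V}^H$, so the first right singular vector is the complex conjugate of its first row.

```python
import numpy as np


def polarization_vector(m_matrix):
    """Return (z, d): the right singular vector for the largest singular value
    of the K x 3 eigenspectra matrix, and all singular values in descending order."""
    _, d, vh = np.linalg.svd(m_matrix, full_matrices=False)
    z = vh[0].conj()        # rows of vh are v_j^H, so conjugate to recover v_1
    return z, d


# Hypothetical test: a nearly rank-one matrix built from a known polarization
# (equal amplitudes on components 1 and 2, with a 90 degree phase lag).
rng = np.random.default_rng(0)
z_true = np.array([1.0, 1.0j, 0.0]) / np.sqrt(2.0)
weights = rng.normal(size=7) + 1j * rng.normal(size=7)
m_demo = np.outer(weights, z_true.conj()) + 1e-3 * rng.normal(size=(7, 3))

z_est, d = polarization_vector(m_demo)
alignment = abs(np.vdot(z_true, z_est))   # 1 means identical up to overall phase
print(alignment, d)
```

Because a singular vector is defined only up to a complex phase factor, the comparison uses the magnitude of the Hermitian inner product; the ratio `d[0] / d[1]` gives the qualitative measure of how well the polarization is determined.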

If $d_1 \gg d_2, d_3$, the three-component particle motion $\mathbf{x}(t)$ in the neighborhood of frequency $f$ can be represented by the real part of $R\,\hat{\mathbf{z}}\,e^{i2\pi ft}$, where $R$ is the amplitude of motion. We can adjust the phase of $\hat{\mathbf{z}}$ so that $R$ is real. If there exists a phase $\phi$ such that $\hat{\mathbf{z}}e^{i\phi}$ is purely real, then all motion described by $\hat{\mathbf{z}}$ lies along a single line in three space. More generally, particle motion will follow an ellipse confined to the plane spanned by the two real vectors $\mathrm{Re}(\hat{\mathbf{z}})$ and $\mathrm{Im}(\hat{\mathbf{z}})$. If this ellipse is strongly elongated along its major axis, reasonable horizontal and vertical azimuths can be found. If the wave type is known, such as a P wave, then the propagation direction can be determined. Strongly elliptical polarization suggests modelike particle motion (for example, a Rayleigh wave) with a poorly defined angle of incidence.

We can project the particle motion described by the complex unit vector $\hat{\mathbf{z}}$ onto an ellipse in the horizontal plane which is defined by $\hat{\mathbf{z}}_H = \hat{\mathbf{z}} - (\hat{\mathbf{e}}_1\cdot\hat{\mathbf{z}})\hat{\mathbf{e}}_1$, where $\hat{\mathbf{e}}_1 = (1, 0, 0)$. The major axis of this horizontal ellipse is taken to be the principal direction of horizontally polarized motion. To find the azimuth of the major axis, we determine the point of greatest displacement for the projection $\hat{\mathbf{z}}_H$ in the horizontal plane by finding the maximum value of

$$|\mathrm{Re}(\hat{\mathbf{z}}_H e^{i2\pi ft})|^2 \tag{4}$$

If the components $(z_1, z_2, z_3)$ of $\hat{\mathbf{z}}$ are expressed in the form $z_j = |z_j|e^{i\phi_j}$, this is equivalent to finding the maxima of

$$|z_2|^2\cos^2(2\pi ft + \phi_2) + |z_3|^2\cos^2(2\pi ft + \phi_3) \tag{5}$$

The extremes of this expression, remembering

$$|z_2|^2\sin 2\phi_2 + |z_3|^2\sin 2\phi_3 = \mathrm{Im}(z_2^2 + z_3^2) \tag{6}$$

are found when the phase angle $\theta$, defined as $\theta = 2\pi ft$, takes the values

$$\theta_l = -\tfrac{1}{2}\arg[z_2^2 + z_3^2] + \frac{l\pi}{2} \tag{7}$$

where $l$ is an integer. Let $l$ be the integer closest to zero which maximizes (5), the horizontal displacement, and for which $\mathrm{Re}(z_1) < 0$. Define the phase angle $\theta_H$ to be the value of $\theta_l$ for this $l$. Once $\theta_H$ has been determined, the horizontal azimuth of the major axis $\Theta_H$, measured counterclockwise from $\hat{\mathbf{e}}_2 = (0, 1, 0)$, can be defined as

$$\Theta_H = \tan^{-1}\left[\frac{\mathrm{Re}(z_3 e^{i\theta_H})}{\mathrm{Re}(z_2 e^{i\theta_H})}\right] \tag{8}$$

The range of the arctangent function is $0° < \Theta_H \le 180°$ if $\mathrm{Re}(z_1 z_3^{*}) < 0$ and $-180° < \Theta_H \le 0°$ if $\mathrm{Re}(z_1 z_3^{*}) \ge 0$. If the particle motion is P-like, $\Theta_H$ can be interpreted as pointing in the direction of the wave source. A representation of an elliptical motion for which $\Theta_H < 0$ is shown in Figure 1.

Fig. 1. Diagram to illustrate the definitions of the polarization angles $\Theta_H$ and $\Theta_V$. The azimuth $\Theta_H$ is restricted to [-180°, 180°] and is measured counterclockwise from $\hat{\mathbf{e}}_2$. The angle $\Theta_H$ is chosen by determining the maximum horizontal displacement of the particle motion for which $\Theta_V$ will fall in the range $0° \le \Theta_V \le 90°$. The ellipticity of the particle motion is defined by the amplitudes $|z_1|$, $|z_2|$, $|z_3|$ and the phase angles $\phi_{HH}$ and $\phi_{VH}$ (defined in text).

Another useful quantity is $\phi_3 - \phi_2 = \phi_{HH}$, the phase difference between the horizontal components of particle motion. If $\phi_2 - \phi_3 = 0°$ or $180°$, the particle motion is predominantly linear. The value $\phi_2 - \phi_3 = 90°$ represents elliptical motion with the major and minor axes oriented along the axes of the instruments. If $z_2 = \pm i z_3$, the particle motion is circular, with no definable azimuth. In this case, the uncertainty in $\Theta_H$, given in the appendix, goes to infinity as it is proportional to $|z_2^2 + z_3^2|^{-1}$.

The expressions relating horizontal to vertical motion are similar. We want to find the angle $\Theta_V$ made with the vertical by the major axis of the ellipse defined by $\mathrm{Re}(\hat{\mathbf{z}}e^{i2\pi ft})$. Define the phase angles

$$\theta_m = 2\pi ft = -\tfrac{1}{2}\arg[z_1^2 + z_H^2] + \frac{m\pi}{2} \tag{9}$$

where $m$ is an integer and $z_H^2 = z_2^2 + z_3^2$. The phase angle $\theta_V$ is the value of $\theta_m$ at an $m$ for which the particle motion displacement is maximized. The angle of incidence is

$$\Theta_V = \tan^{-1}\left|\frac{\mathrm{Re}[z_H e^{i\theta_V}]}{\mathrm{Re}[z_1 e^{i\theta_V}]}\right| \tag{10}$$

where $\mathrm{Im}\,z_H \ge 0$. The absolute value is taken to restrict $\Theta_V$ to lie between 0° and 90°, the usual convention for the angle of incidence (Figure 1). The phase lag between vertical and horizontal motion can also be defined. Define $\phi_{VH} = \theta_H - \theta_V$. Since the end points of the major axis of the horizontal motion ellipse correspond to $\theta_H$ and $\theta_H \pm \pi$, we can restrict the range of $\phi_{VH}$ to (-90°, 90°).


3. A Synthetic Example


We first illustrate the definitions of $\Theta_H$, $\Theta_V$, $\phi_{HH}$, and $\phi_{VH}$ in a synthetic example. We constructed a three-component record (Figure 2) from a sum of cosinusoids:

$$x_1(n\tau) = \sum_{f=0}^{100} \cos\frac{\pi f}{80}\,\cos\left(2\pi f n\tau - \frac{\pi f}{50}\right)$$

$$x_2(n\tau) = \sum_{f=0}^{100} \cos\frac{\pi f}{20}\,\sin\frac{\pi f}{80}\,\cos(2\pi f n\tau) \tag{11}$$

$$x_3(n\tau) = \sum_{f=0}^{100} \sin\frac{\pi f}{20}\,\sin\frac{\pi f}{80}\,\cos(2\pi f n\tau)$$

where $n = 0, 1, \ldots, N-1$ and the sampling interval is $\tau = 0.004$ s. The polarization vector of this signal can be written immediately as

Fig. 2. Polarization test series (amplitude versus time in seconds).

Fig. 3. (a) Amplitude spectra and polarization angles calculated from the test series. Spectra for components 1 (solid line), 2 (coarse dashed line), and 3 (fine dashed line). (b) The singular value associated with principal polarization is plotted against frequency (solid line), and the secondary singular values (dashed lines). (c) Horizontal azimuth of particle motion. (d) Phase angle defined by the major and minor axis of the horizontal particle motion ellipse. (e) Angle of incidence of particle motion measured from nadir. (f) Phase angle defined by major and minor axis of the vertical particle motion ellipse.

where $\phi_{HH} = 0$ and $\phi_{VH} = (-\pi f)/50$. Figure 3 shows the results of a multitaper polarization analysis for frequencies $0 < f < 30$ Hz. The uncertainties are plotted as one standard deviation error bars in this and succeeding figures. Figure 3b shows the three scaled singular values as a function of frequency. The principal polarization appears well determined. The amplitude spectra for the three components are plotted in Figure 3a. The angles $\Theta_H$ and $\phi_{HH}$ are plotted in Figures 3c and 3d. The angle $\phi_{HH}$ is not well determined near zero frequency, as the horizontal signal amplitude is dwarfed by vertical component energy. The apparent horizontal azimuth $\Theta_H$ "wraps around" from 180° to -180° at 20 Hz and jumps 180° at 25 Hz. The former jump is obvious; the latter is an artifact of $\phi_{VH}$ passing through 90°. The phase angle $\phi_{HH}$, estimated from the synthetic record, has a value of 0° or ±180°, to observational accuracies. These values correspond to rectilinear motion and are dependent on the quadrant where the horizontal azimuth is directed. The phase lag $\phi_{VH}$ between vertical and horizontal components is well determined everywhere except very near zero frequency where the horizontal component amplitude vanishes. The ellipticity of particle motion disrupts the linear trend in $\Theta_V$, as shown in Figure 3e. At 25 Hz, $\phi_{VH} = 90°$ and the particle motion is an ellipse with major and minor axes oriented horizontally and vertically, respectively. Therefore $\Theta_V = 90°$ at 25 Hz. At higher frequencies, $\phi_{VH} > 90°$, the relative sign of vertical and horizontal motion reverses, and the particle motion ellipse "tips" in an opposite manner relative to its orientation for $\phi_{VH} < 90°$. This causes the observed 180° jump in apparent horizontal azimuth $\Theta_H$. This example suggests that one should use caution in interpreting the angles $\Theta_H$ and $\Theta_V$ wherever the particle motion is nearly fully elliptical, i.e., when $\phi_{HH}$ or $\phi_{VH}$ is within 20° of ±90°.

4.  Data  Examples 

We illustrate this method of determining the polarization
as a function of frequency with several examples. We
analyzed several waveforms which were recorded on the
Anza array after an earthquake that occurred at
0521:39.5 UT, September 9, 1982, with hypocenter positioned
at 32.93°N, 115.85°W, and depth 4.2 km. The
magnitude $M_L$ was determined to be 4.4. The event was
located near Superstition Mountain, California, on the
western edge of the Imperial Valley. The earthquake was
recorded on only four stations in the array (PFO, KNW,
FRD, and CRY; see Berger et al. [1984] for the definitions
of these three-letter acronyms) as the event occurred prior
to the completion of the array. The hypocenter was
roughly 100 km southeast of the array. The m = 1 component
is the vertical seismometer output with positive
motion defined as up. We choose the m = 2 component
so that positive motion points 45° east of north. Positive
motion along the m = 3 axis is directed 45° west of north,
forming a right-handed coordinate system. Let the angle
$\Theta_H$ be measured counterclockwise from the primary horizontal
axis (m = 2). If the wave propagation is along a
straight line connecting the source and receiver,
-85° > $\Theta_H$ > -98° for the four stations. The first 30 s
of recorded motion for this event are shown in Figure 4,
along with range and azimuth information (azimuth is
measured counterclockwise from N45°E). Both S and P
arrivals are extended wave trains, although the S energy is
more concentrated in time. An interesting feature of this
event is the small precursor to the main P arrival, shown
in the enlarged detail for stations FRD and CRY in Figures
5a and 5b. This waveform corresponds to a lower
crustal phase.

Fig. 4. Anza data used in polarization example. Range in kilometers and
expected $\Theta_H$ are given in right-hand columns. Maximum amplitude in
counts is given in left-hand column, along with station code and component
number.
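The azimuth convention described above can be checked with a few lines of
arithmetic. The following is a minimal sketch, not part of the original
analysis: the function name and the choice of a geographic back azimuth
(degrees clockwise from north, pointing from the receiver toward the
epicenter) as input are assumptions made for illustration. It converts such
a back azimuth to an angle measured counterclockwise from the m = 2 axis
(N45°E), wrapped to the interval [-180°, 180°).

```python
def theta_h_from_back_azimuth(back_azimuth_deg, primary_axis_deg=45.0):
    """Angle counterclockwise from the primary horizontal axis (m = 2).

    back_azimuth_deg : geographic azimuth of the epicenter seen from the
        station, in degrees clockwise from north (an assumed input).
    primary_axis_deg : geographic azimuth of the m = 2 axis; N45E in the text.
    """
    # Clockwise geographic azimuths become counterclockwise angles by
    # subtracting from the azimuth of the reference axis.
    theta = primary_axis_deg - back_azimuth_deg
    # Wrap into [-180, 180).
    return (theta + 180.0) % 360.0 - 180.0


if __name__ == "__main__":
    for baz in (130.0, 135.0, 143.0):
        print(baz, theta_h_from_back_azimuth(baz))
```

Under this assumed convention, the quoted range of -85° to -98° corresponds
to back azimuths of roughly 130° to 143°, that is, an epicenter to the
southeast of the array, as stated in the text.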


Polarization analysis reveals that the first arrivals have
complicated polarization signatures. The time window
taken is short (1.6 s), corresponding to a Rayleigh frequency
$1/(N\tau)$ of 0.625 Hz. Analysis using seven $4\pi$ prolate
tapers averages energy over a band of width $8/(N\tau)$,
so that all of the estimates shown represent an average
over a 5-Hz frequency band. If the true polarization
varied significantly over this bandwidth, one would expect
$\phi_{HH}$ and $\phi_{VH}$ to be relatively poorly determined.
The results for FRD are shown in Figure 6. The singular
values $d_2$ and $d_3$ displayed in Figure 6b show local maxima
at several places in the spectrum from 0 to 30 Hz. Maxima
at 2.5, 7.5, and 14 Hz correspond to boundaries
between distinct spectral features (Figure 6a). All the
maxima below 25 Hz correspond to frequencies at which
one or more of the polarization angles change rapidly.
Horizontal motion is roughly rectilinear below 13 Hz, but
its azimuth is variable and significantly different from the
nominal azimuth of -87°. In fact, the largest amplitude
signal, from 8 to 13 Hz, is oriented clockwise 125° from
the primary component, a deflection of nearly 40° from
the nominal P wave arrival azimuth. The phase lag
between horizontal and vertical motion is alternately positive
and negative in adjacent frequency bands, but the motion is never
more than partially elliptical. The angle between vertical
and horizontal motion, which can be interpreted in this
case as the nominal angle of incidence, varies smoothly
with frequency in Figure 6e, with $\Theta_V \approx$ 25°-30° for
f < 10 Hz, and $\Theta_V \approx$ 15° above 13 Hz.

Fig. 5. (a) Plots of precursory waveform observed on station FRD.
(b) Plots of precursory waveform observed on station CRY. The portion
used for spectrum analysis is bounded by dashed lines. Both horizontal
components at station FRD exhibit visible 60-Hz power line noise. The
spectral leakage resistance of the $4\pi$ prolate eigentapers used in the
analysis guards against bias in the frequency band of interest.
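The resolution figures quoted for the 1.6-s window follow from simple
arithmetic, sketched below. The function names are hypothetical, and the
"2NW - 1" taper count is the usual rule of thumb for prolate tapers, assumed
here rather than taken from the text; the time-bandwidth product NW = 4
corresponds to the $4\pi$ tapers mentioned above.

```python
def rayleigh_frequency(window_length_s):
    """Rayleigh frequency 1/(N*tau) of a window of duration N*tau seconds."""
    return 1.0 / window_length_s


def multitaper_bandwidth(window_length_s, time_bandwidth=4.0):
    """Full width (Hz) of the band averaged by tapers with product NW."""
    # The averaging band spans +/- NW Rayleigh frequencies about each estimate.
    return 2.0 * time_bandwidth * rayleigh_frequency(window_length_s)


def usable_taper_count(time_bandwidth=4.0):
    """Rule-of-thumb number of well-concentrated tapers, 2*NW - 1."""
    return int(2 * time_bandwidth) - 1


if __name__ == "__main__":
    print(rayleigh_frequency(1.6))       # Rayleigh frequency in Hz
    print(multitaper_bandwidth(1.6))     # averaging bandwidth in Hz
    print(usable_taper_count())          # number of tapers
```

For a 1.6-s window this gives a Rayleigh frequency of 0.625 Hz, an averaging
band eight Rayleigh frequencies (5 Hz) wide, and seven tapers, matching the
values used in the analysis.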

Figure 7 shows an analysis of the small amplitude P
precursor observed at station CRY. The variation of the
largest singular value $d_1$ with frequency shows four frequencies
(2.5, 7, 12, and 16 Hz) at which the principal
polarization vector is poorly determined, and there is a
peak in $d_2$. Each of these peaks in $d_2$ occurs where there
is an abrupt change in the three-component spectra and in
one or more of the polarization angles. Although the
estimated uncertainties are larger than those in the last
example, the variability among frequency bands is clearly
visible in Figures 7c-f. Motion in the horizontal plane is
dominantly elliptical below 14 Hz, but particle rotation
proceeds in opposite senses in the two frequency bands
2.5 Hz < f < 7 Hz and 7 Hz < f < 14 Hz. The azimuthal
angle $\Theta_H$ hovers near the value expected for the epicenter
(-85°), but our synthetic example in Figure 3 suggests
that this may be due to the roughly 90° phase lag between
component motions. At higher frequencies, including the
substantial spectral peak at 18-20 Hz, the observed horizontal
azimuth of particle motion is roughly transverse to
the arrival azimuth, as though the energy at these frequencies
were SH in character. A better interpretation is
in terms of side-scattered P energy, as the vertical
azimuth of particle motion $\Theta_V$ remains in the 20°-40°
range across all frequencies in Figure 7e.

Similar behavior is observed on stations PFO and KNW.
The nature of this polarization behavior is quite puzzling.
It is unlikely that instrument calibrations are at fault. A
timing error among components would result in a linear
drift in the relative phase angles, similar to that shown in
Figure 3f. There are no poles or zeroes in the instrument
response over the frequency region shown. A perturbation
in the response filter characteristics would have
difficulty mimicking the apparent boundaries between
spectral processes. Moreover, we show below that the
relative polarization shift from frequency band to frequency
band varies greatly within the P coda. This argues
for a signal-generated effect rather than an instrument
effect. This behavior may reflect the modal structure of
an intercrustal head wave in a stratified crust. Another
interpretation is in terms of resonant vibrational modes in
the earth structure near the receiver. Structure of scale
lengths 100-200 m could account for the higher-frequency
resonances observed in Figures 6 and 7.

Fig. 6. Amplitude spectra and polarization angles for precursory waveform
observed at station FRD. Solid/dashed line conventions are identical to
those of Figure 3.

Fig. 7. Amplitude spectra and polarization angles for precursory waveform
observed at station CRY. Solid/dashed line conventions are identical to
those of Figure 3.

We performed experiments to see if such resonant
behavior could be found in the P codas for this event.
When the entire coda was used for polarization analysis,
the results were poor. The three-component seismogram
recorded at station KNW is shown in Figure 8. Figure 9
presents polarization data from the 14-s P wave coda.
There appear to be competing signals at nearby frequencies,
creating either rapid variations in the polarization,
which are difficult to interpret, or else large uncertainties
in the polarization. Likewise, the presence of both SV-
and SH-polarized energy in the S arrivals made the
identification of a "principal" polarization uncertain.

We chose, therefore, to analyze the P codas of these
records in successive 2-s (500 sample) segments. We
observed what appear to be resonances over 4-6 Hz frequency
bands and variations in polarization over time that
suggest the arrival of P energy which has been scattered
within the crust. The results of a polarization analysis of
the first, fourth, and sixth 2-s time segments of the P
wave coda recorded at station KNW are shown in Figures
10-12. The growth of the "noise" singular values $d_2$ and
$d_3$ as the time window moves through the coda suggests
an increase in scattered energy. The most prominent
features in the spectra of the principal polarization components
are the spectral peaks near 5 and 14 Hz. Comparison
of the values of $\Theta_H$ in the time windows indicates
that there is a boundary between two distinct spectral
processes at 7-7.5 Hz. The 7-14 Hz process is characterized
by dominantly rectilinear horizontal motion and steeply
vertical particle motion. The relative phase angles $\phi_{HH}$
and $\phi_{VH}$ for the lower-frequency process exhibit more
variability. Within a 2-s time window the horizontal
azimuth varies only slightly within the 0-7 Hz frequency
band, with more shallow vertical angles. Figure 12e shows
that $\Theta_V \approx$ 60° in this frequency band, which may indicate
SV-converted motion. Particle motion at frequencies
greater than 15 Hz bears little relation to the higher-amplitude
low-frequency signal and often cannot be interpreted
in terms of P-, SV-, or SH-polarized motion traveling
directly from source to receiver.

Fig. 8. Three-component seismogram for Superstition Mountain event
observed at Anza station KNW. The 14-s segment chosen for polarization
analysis is within the dashed lines.

Fig. 9. Amplitude spectra and polarization angles for the 14-s P coda
segment shown in Figure 8. Solid/dashed line conventions are identical to
those of Figure 3.


The similar frequency dependences of $\Theta_H$ and $\Theta_V$ in
these 2-s time windows contrast with the absence of a
clear pattern in the longer time analysis shown in Figure 9.
Similar effects are found when records from the other
three stations for this event are analyzed. This is not
surprising when one notes the large variation of polarization
among the three time windows shown in Figures
10-12. The azimuth of the epicenter has $\Theta_H \approx$ -92° (i.e.,
clockwise) from the second component. The horizontal
azimuth $\Theta_H$ of particle motion is, for 7.5 Hz
< f < 14 Hz, always oriented more to the south, with
values that vary among time windows by 40° or more. At
f < 7 Hz, several of the time windows tested were consistent
with -92° relative azimuth, but the fourth and
sixth segments, shown in Figures 11 and 12, show particle
motion whose horizontal orientation is nearly pure east-west.
We take this variation as evidence for the arrival of
scattered off-azimuth P energy.

Fig. 10. Amplitude spectra and polarization angles for the first 2 s of the
14-s P coda segment shown in Figure 8. Solid/dashed line conventions are
identical to those of Figure 3.

Fig. 11. Amplitude spectra and polarization angles for the seventh and
eighth seconds of the 14-s P coda segment shown in Figure 8. Solid/dashed
line conventions are identical to those of Figure 3.

A detailed interpretation of these results is beyond the
scope of this paper, but we can draw parallels with recent
studies of high-frequency seismic spectra. Sereno and
Orcutt [1985] have shown that the extended $P_n$ wave train
observed in ocean bottom seismic data can be modeled by
reverberations in the oceanic sediment layer and overlying
water column, buttressing their comparison by demonstrating
a simple pattern of spectral peaks corresponding to
leaky vibrational modes. Bard and Bouchon [1985] have
shown spectra from seismic events for which the retrieval
of simple source parameters like corner frequency and
high-frequency roll-off is contaminated by a high-frequency
resonance which they model as a reverberation
in the low-velocity surface layer. The apparent polarization
resonances observed in the P wave codas of the September
1982 Superstition Mountain event probably argue
for an even more complex structure than was postulated
in these studies.

Fig. 12. Amplitude spectra and polarization angles for the eleventh and
twelfth seconds of the 14-s P coda segment shown in Figure 8. Solid/dashed
line conventions are identical to those of Figure 3.

The interpretation of the coda using resonance models
may offer a more direct method for characterizing near-receiver
structure than time-domain models of scattered
waves [e.g., Sato, 1984]. If the resonances of the structure
beneath one's receivers are known, we can hope to
determine better the spectral shape of the original seismic
source. If we model the response $\mathbf{R}(f, \hat{\mathbf{z}})$ of the crustal
structure local to a receiver to waves traveling in the
lithospheric wave guide with frequency f and polarization
$\hat{\mathbf{z}}$, we expect observed three-component amplitude spectra
$\mathbf{V}(f)$ to be found by integrating

$$\mathbf{V}(f) = \int \mathbf{R}(f, \hat{\mathbf{z}})\, s(f, \hat{\mathbf{z}})\, d\Omega \qquad (13)$$

where $s(f, \hat{\mathbf{z}})$ is the amplitude of the impinging signal.
We integrate $\hat{\mathbf{z}}$ over the lower half of the unit sphere in
order to account for energy arriving from all vertical
azimuths and out-of-plane scattering. In the example of
Sereno and Orcutt [1985], $\mathbf{R}(f, \hat{\mathbf{z}})$ was calculated for a simple
layered model. For arrays (such as Anza) positioned
atop a heterogeneous medium, constraints on $\mathbf{R}(f, \hat{\mathbf{z}})$ can
be found empirically using a number of events at different
azimuths. Determination of $\mathbf{R}(f, \hat{\mathbf{z}})$ may be helpful in
evaluating the earthquake hazards of a potential building
site, especially as polarization analysis specifies both
seismic amplitude and particle motion at the recording
site. More research is necessary to determine if such a
project is feasible. The above examples suggest that
$s(f, \hat{\mathbf{z}})$ varies significantly within the coda, complicating
the determination of the near-receiver resonant structure.

5. Conclusions

We have devised a multitaper algorithm to determine
the polarization of particle motion as a function of frequency
and applied it to data recorded on the Anza
Seismic Telemetered Array [Berger et al., 1984]. We form
a matrix of eigenspectra of three-component records and
perform a singular value decomposition to estimate the
complex-valued unit vector $\hat{\mathbf{z}}$ whose components specify
the sense of particle motion in the plane defined by the
two real vectors Re($\hat{\mathbf{z}}$) and Im($\hat{\mathbf{z}}$). We manipulate the
components of $\hat{\mathbf{z}}$ in order to specify four angles. The
angle $\phi_{HH}$ represents the relative phase between the components
of horizontal motion. The angle $\phi_{HH}$ = 0° or
±180° if the particle motion is rectilinear in the horizontal
plane, and $\phi_{HH}$ = ±90° if the motion is elliptical and
oriented along the component axes. The phase angle $\phi_{VH}$
is the relative phase between horizontal and vertical
motion. The apparent azimuth $\Theta_H$ is defined by the maximum
displacement of the horizontal projection of the particle
motion ellipse. It is measured in the counterclockwise
direction from the first horizontal component.
Finally, an angle of incidence $\Theta_V$ of the particle motion is
estimated. The uncertainties in these polarization angles
can be estimated from the singular value decomposition
used to obtain $\hat{\mathbf{z}}$ (appendix).
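The procedure summarized above (taper, transform, stack the eigenspectra
into a matrix, decompose) can be sketched in a few lines. The block below is
an illustrative sketch under stated assumptions, not the authors' code: the
function names are hypothetical, the prolate tapers are computed from the
standard tridiagonal eigenproblem, each row of the matrix is taken to be one
eigenspectrum of the three components, and the polarization estimate is
taken as the first row of $V^H$ so that $M \approx d_1 \mathbf{u}_1 \hat{\mathbf{z}}^T$.
The paper's own definition of M in (3) may differ by a transpose or conjugate.

```python
import numpy as np


def dpss_tapers(n, nw, k):
    """First k discrete prolate spheroidal tapers of length n (unit norm).

    Uses the symmetric tridiagonal matrix that shares eigenvectors with the
    Slepian concentration problem; nw is the time-bandwidth product.
    """
    w = nw / n
    t = np.arange(n)
    diag = ((n - 1 - 2 * t) / 2.0) ** 2 * np.cos(2.0 * np.pi * w)
    off = t[1:] * (n - t[1:]) / 2.0
    mat = np.diag(diag) + np.diag(off, 1) + np.diag(off, -1)
    _, vecs = np.linalg.eigh(mat)            # eigenvalues ascend
    return vecs[:, ::-1][:, :k].T            # best-concentrated first, (k, n)


def principal_polarization(x, dt, nw=4.0, k=7):
    """Multitaper SVD polarization estimate for x of shape (3, n).

    Returns frequencies, singular values d (nfreq, 3), and the principal
    polarization vector z (nfreq, 3) at each frequency.
    """
    n = x.shape[1]
    tapers = dpss_tapers(n, nw, k)
    # Eigenspectra of every component for every taper: shape (k, 3, nfreq).
    eig = np.fft.rfft(tapers[:, None, :] * x[None, :, :], axis=-1)
    m = np.transpose(eig, (2, 0, 1))         # one (k, 3) matrix per frequency
    _, d, vh = np.linalg.svd(m, full_matrices=False)
    z = vh[:, 0, :]                          # row of V^H for the largest d
    return np.fft.rfftfreq(n, dt), d, z


# Synthetic check: a 10-Hz signal with known complex polarization p.
dt, n, f0 = 0.004, 500, 10.0
t = np.arange(n) * dt
p = np.array([1.0, 0.5, 0.5j])               # vertical, horizontal 2, horizontal 3
x = np.real(p[:, None] * np.exp(2j * np.pi * f0 * t)[None, :])
freqs, d, z = principal_polarization(x, dt)
i0 = int(np.argmin(np.abs(freqs - f0)))
# Relative phase between the two horizontal components, in degrees.
phi_hh = np.degrees(np.angle(z[i0, 2] * np.conj(z[i0, 1])))
alignment = abs(np.vdot(p / np.linalg.norm(p), z[i0]))
```

In this noise-free test the recovered vector should be aligned with p up to
an overall phase, and the horizontal components should differ in phase by
about 90°, the fully elliptical case discussed in the text. With real data
the two smaller singular values indicate how well the principal polarization
is determined.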

The variability of the spectra and polarization over
0 ≤ f < 30 Hz suggests that the P coda observations can
be separated into several distinct varieties of seismic
motions, each occupying a separate frequency band. This
behavior suggests that in some cases it may be more
appropriate to model the P wave coda as a set of resonant
modes caused by near-receiver structure rather than a
number of randomly scattered compressional pulses. Evidence
for scattered energy is not lacking, however, as the
principal polarization accounts for a smaller proportion of
the total seismic energy late in the P coda, accounting for
only 60-65% in some frequency bands. We also observe
that the apparent P wave arrival azimuth can vary by up
to 50°, both between adjacent frequency bands and in
adjacent time windows. Both rectilinear and elliptically
polarized signals are found, often coexisting in the same
time window in adjacent frequency bands. We find that
the apparent modal structure of the signal polarization
breaks down if the length of the time window is much
greater than 2 s, suggesting incoherent excitation by direct
and scattered seismic waves.

We are currently investigating the polarization behavior
of the data recorded at each site in the Anza array. We
want to use the polarization information to obtain better
estimates of the seismic source spectrum. Such an
endeavor requires that one be able to identify the factors
causing the apparent jumps in polarization, both as a function
of frequency and time.

Appendix 

Formal  Uncertainty  of  Polarization  Estimates 

We estimate the uncertainties in the angles $\Theta_V$, $\Theta_H$,
$\phi_{VH}$, $\phi_{HH}$ from uncertainties $\delta\hat{\mathbf{z}}(f)$ in the unit eigenvector
$\hat{\mathbf{z}}(f)$, which represents the principal polarization of particle
motion at frequency f. The derivation of the rms expectation
of $\delta\hat{\mathbf{z}}$ can be found in the work by Park and Chave
[1984]. We only define the problem and state the results
here. The vector $\hat{\mathbf{z}}$ is $\mathbf{v}_1$, the right eigenvector of M
(defined in (3)) associated with the largest singular value $d_1$.
The uncertainty $\sigma$ is estimated from the two smaller
singular values

<r:=  -rA-r  (<(<  •</?)  2  (All 

K  -- 1 

where K is the number of eigenspectra used in forming M.
The covariance matrix for the first-order uncertainty $\delta\hat{\mathbf{z}}$
has expectation value


hi  X  (hi  )* 


£  v,«(v  )* 


( A2) 


It is also true that $\langle \delta\hat{\mathbf{z}} \otimes \delta\hat{\mathbf{z}} \rangle = 0$. The $\otimes$ symbol denotes
the tensor (outer) product of two vectors. Since

$$\left(\delta\hat{\mathbf{z}} \otimes (\delta\hat{\mathbf{z}})^*\right)_{ij} = \delta z_i\, \delta z_j^*$$

we have complete information on the formal uncertainties
of the components of the principal polarization. Note that
since $(\delta\hat{\mathbf{z}})^* \cdot \hat{\mathbf{z}} = 0$ as $\hat{\mathbf{z}}$ is a vector of unit length, $\delta\hat{\mathbf{z}}$ is composed
of $\mathbf{v}_2$ and $\mathbf{v}_3$, the right eigenvectors associated with
the "noise" singular values $d_2$, $d_3$.

Given (A2), we can determine the formal first-order
uncertainty of any well-behaved function of $\hat{\mathbf{z}} = (z_1, z_2, z_3)$.
For example, $\phi_{HH} = \arg(z_3) - \arg(z_2)$ to within an
additive constant. Since


S.-vR;,*  I  |fir,l’ 


Since  (-»//  =  Redan  '(zvzdd 


fit-),,  =  Re  -~v  t  '  «  —  =  Re 


-  2Re[.-:’r,.vrAr;  ]  -  |R;,j-  )  (A6) 

Note that $|\delta\Theta_H|^2 \to \infty$ as $z_3 \to \pm i z_2$, i.e., circular polarization.

The  uncertainties  of  the  vertical  polarization  angles  <->, 
and  HUi  are  similar.  With  <-),  given  by  (81,  where 
-zr,  we  use  the  relation  <5r=;  :/i;})  to 

find 

RH,  -  Re  I*  <A7, 


|bt->i  |:  =  - a 


V  (z  M ) 


V  (z  M )  =  I;  I*  |Rr, 


-  2  Re  R;,Rr  *  )  2  Re  fir.fizf  ) 

•  2  Re  ( |z,  |:r>  f  )  (A9) 

The restriction of the argument of the arctangent to be
positive in the definition of $\Theta_V$ does not alter its uncertainty.
Following (A3) and (A4), the uncertainty of $\phi_{VH}$ is


Z>Zi 

Ri/).,,  =  1m  — - 


where X is given in (A9).

Summary. We present a new method for estimating the frequencies of the
Earth's free oscillations. This method is an extension of the techniques of
Thomson (1982) for finding the harmonic components of a time series.
Optimal tapers for reducing the spectral leakage of decaying sinusoids
immersed in white noise are derived. Multiplying the data by the best K
tapers creates K time series. A decaying sinusoid model is fit to the K time
series by a least squares procedure. A statistical F-test is performed to test the
fit of the decaying sinusoid model, and thus determine the probability that
there are coherent oscillations in the data. The F-test is performed at a number
of chosen frequencies, producing a measure of the certainty that there is a
decaying sinusoid at each frequency. We compare this method with the conventional
technique employing a discrete Fourier transform of a cosine-tapered
time series. The multiple-taper method is found to be a more sensitive detector
of decaying sinusoids in a time series contaminated by white noise.

Key words: multiple taper, free oscillations, spectral analysis


1 Introduction

The free oscillations of the Earth appear as decaying sinusoids in the records of instruments
in the available low-frequency seismic arrays (International Deployment of Accelerometers,
hereafter referred to as IDA, and Global Digital Seismic Network, hereafter referred to as
GDSN) (Agnew et al. 1976; Engdahl, Peterson & Orsini 1982). Information about the
structure of the Earth can be inferred from the frequencies, decay rates and amplitudes of
these oscillations.

Conventionally, these characteristics of the decaying sinusoids are estimated from a direct
spectral estimate of the data using a cosine taper (Harris 1978; Dahlen 1982; Lindberg 1986),
or by producing spherical harmonic-weighted sums of the direct spectral estimates made
from each station's record ('stacking' or 'stripping') (Gilbert & Dziewonski 1975). There are
several difficulties with using a single cosine taper in the harmonic analysis of free oscillations.
The time series analysed in free oscillation studies are non-stationary; they are also contaminated
with noise. The cosine taper is symmetric and appropriate for stationary time series: it
is not a good taper for minimizing the spectral leakage of decaying sinusoids immersed in
noise. The cosine taper also discards much of the data at the ends of the time series, particularly
at the beginning where the signal-to-noise ratios of the free oscillation records are
large. This is not desirable. In addition, applying a cosine taper to reduce spectral leakage is
purchased with greatly increased variance (e.g. figs 7 and 8 of Dahlen 1982). Use of the
cosine taper roughly doubles the variance, or equivalently, halves the statistical efficiency of the
estimate (Jones 1962). Another drawback of a cosine-taper direct spectral estimate is that it
does not discriminate between oscillations of constant phase and frequency (harmonic
oscillations) and broad distributions of spectral energy caused by other processes.

To overcome these problems, we have developed a method of harmonic analysis for
decaying sinusoids immersed in stationary white noise based on the methods developed by
Thomson (1982). A set of several 'optimal' tapers is created, each one designed to minimize
the spectral leakage of decaying sinusoids immersed in white noise, while maintaining a large
value for the ratio of tapered signal energy to tapered noise energy. Multiplying the data by
each taper in turn creates several time series. Taking the discrete Fourier transform of these
time series yields several complex eigenspectra (called eigencoefficients by Thomson 1982).
A decaying sinusoid model is fit by a least-squares procedure to these complex eigenspectra.
The least-squares procedure produces an estimate of the initial amplitude of any decaying
sinusoids in the data. The fit of the decaying sinusoid model at any given frequency is tested
using a statistical F-test. This gives a quantitative measure of the confidence that a phase-coherent
decaying sinusoid is present in the data at any given frequency.
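The taper, transform, fit and test sequence described above can be
illustrated with the simpler stationary case. The sketch below implements
the standard harmonic F-test of Thomson (1982) for a non-decaying sinusoid
in white noise, using ordinary prolate tapers; it is not the decaying-sinusoid
version developed in this paper, whose tapers and fitting equations differ.
All function names are hypothetical, and the synthetic data are generated
only for the demonstration.

```python
import numpy as np


def dpss_tapers(n, nw, k):
    """First k unit-norm discrete prolate spheroidal tapers of length n."""
    w = nw / n
    t = np.arange(n)
    diag = ((n - 1 - 2 * t) / 2.0) ** 2 * np.cos(2.0 * np.pi * w)
    off = t[1:] * (n - t[1:]) / 2.0
    mat = np.diag(diag) + np.diag(off, 1) + np.diag(off, -1)
    _, vecs = np.linalg.eigh(mat)            # eigenvalues ascend
    return vecs[:, ::-1][:, :k].T            # shape (k, n)


def harmonic_f_test(x, nw=4.0, k=7):
    """Least-squares line amplitude and F statistic at each Fourier frequency.

    Model at frequency f: eigenspectrum y_k(f) = mu * U_k(0) + noise, where
    U_k(0) is the sum of the kth taper. Under the no-signal hypothesis the
    statistic follows an F distribution with 2 and 2k - 2 degrees of freedom.
    """
    n = len(x)
    tapers = dpss_tapers(n, nw, k)
    y = np.fft.rfft(tapers * x, axis=1)      # eigenspectra, shape (k, nfreq)
    u0 = tapers.sum(axis=1)                  # taper transforms at zero offset
    norm = np.sum(u0 ** 2)
    mu = (u0[:, None] * y).sum(axis=0) / norm
    resid = np.sum(np.abs(y - mu * u0[:, None]) ** 2, axis=0)
    f_stat = (k - 1) * np.abs(mu) ** 2 * norm / resid
    return np.fft.rfftfreq(n), mu, f_stat


# Synthetic record: unit-amplitude cosine at Fourier bin 64 plus white noise.
rng = np.random.default_rng(0)
n, i0 = 512, 64
t = np.arange(n)
x = np.cos(2.0 * np.pi * (i0 / n) * t) + 0.5 * rng.standard_normal(n)
freqs, mu, f_stat = harmonic_f_test(x)
```

A large value of `f_stat` at a given frequency indicates that a phase-coherent
line explains the eigenspectra there much better than noise alone. For the
real cosine above, `abs(mu[i0])` estimates half the cosine amplitude, since
only the positive-frequency component is fitted.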

The multiple-taper method utilizes more of the data than the cosine-taper direct spectral
estimate, and, as shown in Section 4 and the appendix, is a more sensitive detector of free
oscillations in a seismic record. In one example, the five singlets of $_0S_2$ could be detected in a
single record of the 1977 Sumbawa event, with measured frequencies in good agreement
with those reported by Buland, Berger & Gilbert (1979), who used a six-station global array
stack. Only two of the singlet lines are visible in the conventional direct spectral estimate
employing a cosine taper.

The multiple-taper technique for free-oscillation analysis is described in the following
sections. Section 2 introduces the functionals which are optimized to yield a family of
spectral leakage-suppressing eigentapers appropriate for an oscillation with a given attenuation
rate. Functionals for decaying sinusoids in time series with and without white noise are
discussed. Section 3 introduces the statistical F-test for detection of decaying sinusoids. In
Section 4 we present a number of frequency measurements of isolated free oscillations using
IDA network data. Our conclusions are summarized in Section 5. An error analysis of the
method is included in the appendix. Readers interested primarily in the examples are
directed to Section 4. To implement the technique on a computer one needs to solve (2.14)
to design the tapers, apply (5.15) to estimate the decaying sinusoid amplitudes as a function
of frequency, and compute (5.2M) to produce an F-test plot to test for the existence of
decaying sinusoids at any given frequency.

2  Optimal  data  tapers  for  decaying  signals 

In this section we adapt the methods described in a series of five papers by Slepian, Landau
and Pollak (Slepian & Pollak 1961; Landau & Pollak 1961, 1962; Slepian 1964, 1978). Their
work involved a set of time-limited functions whose spectral energy is optimally concentrated
within a given frequency band. These functions have been employed to design optimal
tapers for the analysis of stationary processes (Thomson 1982). We have extended Thomson's
work to produce tapers for the harmonic analysis of exponentially decaying signals. For
signals that decay exponentially with time, we obtain an optimization equation from which
one can find the data taper $w_0(t)$ with optimal resistance to spectral leakage from outside a
frequency band of chosen width. Solving the optimization equation, one discovers that there
exists a family of data tapers $\{w_0(t), w_1(t), \ldots, w_{K-1}(t)\}$ with good spectral leakage
resistance. We refer to the members of this family as eigentapers. These tapers are eigenvectors
of a Toeplitz matrix whose elements are values taken by the function $\sin x / x$. In the
next section we produce several spectra from a single record multiplied by each of the
eigentapers in turn, and we show how these spectra can be combined to provide useful
information.

An important factor in the analysis of low-frequency seismic data is the presence of
stationary white noise in the records. This was recognized by Dahlen (1982): the presence of
stationary noise determined the optimal time-series length for estimation of parameters in
Dahlen's analysis. In his work, however, the taper shapes were held fixed. In this study, we
extend the methods of Thomson (1982) to derive optimal taper shapes for any length time
series, characterized by a parameter depending on the signal-to-noise ratio at the start of the
seismic record. These 'noise-cognizant' tapers have less resistance to spectral leakage than
those designed using a procedure that ignores stationary noise. In the appendix we show how
noise-cognizant tapers improve the sensitivity of the eigentaper analysis if stationary noise is
present in the data.


2A Decaying signal with no noise

Consider first a signal $x(t)$ that consists of a sum of decaying sinusoids uncorrupted by noise. Then one can represent

$$x(t) = \sum_j \mu_j \exp(i\omega_j t - \alpha_j t), \qquad t \ge 0,$$

where $\mu_j$ is the complex amplitude of the $j$th decaying sinusoid, which has angular frequency $\omega_j$ and decay rate $\alpha_j$. In practice, one cannot measure $x(t)$, but only the $N$ discrete numbers $x(t_0), x(t_1), \ldots, x(t_{N-1})$. Assume that $t_0 = 0$, and that the time between samples $\delta t = t_{i+1} - t_i$ is a constant, which we scale to be unity. If $\delta t = 1$, then the Nyquist frequency $f_{\rm Nyquist} = 1/2$ and the angular frequency $\omega = 2\pi f$ is defined on its principal domain $(-\pi, \pi]$. Tapering the time series $\{x(t)\}_{t=0}^{N-1}$ consists of multiplying it by a real-valued sequence $\{w(t)\}_{t=0}^{N-1}$ (the taper). Taking the discrete Fourier transform of the tapered signal $\{x(t)w(t)\}_{t=0}^{N-1}$ yields the function

$$y(\omega) = \sum_{t=0}^{N-1} \exp(-i\omega t)\, w(t)\, x(t). \tag{2.1}$$

This sum may be quickly computed using the Fast Fourier Transform (FFT) algorithm (Cooley & Tukey 1965; Brigham 1974). A traditional estimate of the energy content of $x(t)$ as a function of frequency is given by $|y(\omega)|^2$, where $\{w(t)\}_{t=0}^{N-1}$ is a conventional taper (Hann, Hamming, Blackman-Harris, Morse no. 2, etc.; Harris 1978 describes many of the popular tapers). The finite length of the time series makes a boxcar taper implicit if $w(t) = 1$ in (2.1). One wishes to choose $\{w(t)\}_{t=0}^{N-1}$ to facilitate determination of the frequency content of $x(t)$.
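The tapered transform (2.1) can be illustrated numerically. The following is a minimal sketch, not part of the original analysis: the test signal, its frequency `omega0`, its decay rate `alpha`, the band half-width and the helper names `tapered_dft` and `leakage_fraction` are all assumptions chosen for illustration. It evaluates (2.1) on a zero-padded FFT grid for a boxcar and for a Hann taper, and compares the fraction of energy that leaks outside a band centred on the sinusoid.

```python
import numpy as np

def tapered_dft(x, w, n_fft=None):
    """Evaluate y(omega) = sum_t exp(-i omega t) w(t) x(t) on the FFT grid."""
    x = np.asarray(x, dtype=complex)
    w = np.asarray(w, dtype=float)
    n_fft = len(x) if n_fft is None else n_fft
    omega = 2.0 * np.pi * np.arange(n_fft) / n_fft   # grid on [0, 2 pi)
    return omega, np.fft.fft(w * x, n=n_fft)         # zero-pads to n_fft

def leakage_fraction(y, omega, centre, half_width):
    """Fraction of |y|^2 outside (centre - half_width, centre + half_width)."""
    offset = np.angle(np.exp(1j * (omega - centre)))  # wrapped to (-pi, pi]
    power = np.abs(y) ** 2
    return power[np.abs(offset) >= half_width].sum() / power.sum()

# Hypothetical test signal: one decaying sinusoid at unit sample spacing.
N = 256
t = np.arange(N)
omega0, alpha = 0.9, 0.004
x = np.exp((1j * omega0 - alpha) * t)

boxcar = np.ones(N)        # w(t) = 1, the implicit taper
hann = np.hanning(N)       # one of the conventional tapers

omega, y_box = tapered_dft(x, boxcar, n_fft=4096)
_, y_hann = tapered_dft(x, hann, n_fft=4096)

half_width = 8.0 * np.pi / N   # illustrative band half-width
leak_box = leakage_fraction(y_box, omega, omega0, half_width)
leak_hann = leakage_fraction(y_hann, omega, omega0, half_width)
print(leak_box, leak_hann)
```

The sign convention of `numpy.fft.fft` matches the factor exp(-iωt) in (2.1), so no conjugation is needed. The leakage fractions are Riemann-sum approximations on the FFT grid.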

The primary purpose of a data taper is to minimize spectral leakage. That is, the spectral component of a tapered signal at frequency $\omega$ should have minimal energy contribution from outside the interval $(\omega - \Omega, \omega + \Omega)$, where $0 < 2\Omega < 2\pi$ is a chosen bandwidth. One must also prevent the energy at $\omega$ from leaking out to affect parts of the spectrum at other frequencies. Suppose that $x(t)$ consists of only one decaying sinusoid in $(\bar{\omega} - \Omega, \bar{\omega} + \Omega)$, with frequency $\bar{\omega}$. The tapered signal $\{w(t)\,\mu \exp(i\bar{\omega} t - \alpha t)\}_{t=0}^{N-1}$ should have as much of its energy as possible in $(\bar{\omega} - \Omega, \bar{\omega} + \Omega)$ relative to its total energy, which covers the entire band $(-\pi, \pi]$. One chooses a taper $\{w(t)\}_{t=0}^{N-1}$ to maximize the functional

$$f = \frac{\displaystyle\int_{\bar{\omega}-\Omega}^{\bar{\omega}+\Omega} |y(\omega)|^2\, d\omega}{\displaystyle\int_{-\pi}^{\pi} |y(\omega)|^2\, d\omega}, \tag{2.2}$$

where $y(\omega)$ is the discrete Fourier transform of $\{x(t)w(t)\}_{t=0}^{N-1}$,

$$y(\omega) = \mu \sum_{t=0}^{N-1} \exp(-i\omega t) \exp(i\bar{\omega} t)\, w(t) \exp(-\alpha t).$$

(Slepian 1983 describes how maximizing a similar functional yields solutions to the concentration problem, which is important in electrical engineering.) Since our time signal is limited to $[0, N-1]$, there is no way to confine completely the energy of its frequency transform to $(\bar{\omega} - \Omega, \bar{\omega} + \Omega)$. Therefore, the value $f$ will always be less than unity.

We expand the numerator of (2.2)

$$\begin{aligned}
\int_{\bar{\omega}-\Omega}^{\bar{\omega}+\Omega} |y(\omega)|^2\, d\omega
&= |\mu|^2 \int_{-\Omega}^{\Omega} d\omega \sum_{t=0}^{N-1} \exp(-i\omega t)\, w(t) \exp(-\alpha t) \times \sum_{s=0}^{N-1} \exp(i\omega s)\, w(s) \exp(-\alpha s) \\
&= 2|\mu|^2 \sum_{t=0}^{N-1} \sum_{s=0}^{N-1} w(t) \exp(-\alpha t)\, \frac{\sin \Omega(t-s)}{(t-s)}\, \exp(-\alpha s)\, w(s)
\end{aligned} \tag{2.3}$$

and use Parseval's theorem to expand the denominator

$$\int_{-\pi}^{\pi} d\omega\, |y(\omega)|^2 = 2\pi|\mu|^2 \sum_{t=0}^{N-1} w(t) \exp(-2\alpha t)\, w(t), \tag{2.4}$$

so that (2.2) becomes dependent entirely on $w(0), w(1), \ldots, w(N-1)$ and simple functions.

Define the $N$-vector $\mathbf{w} = [w(0), w(1), \ldots, w(N-1)]$, the matrix $\mathbf{A}$ with elements

$$A_{lm} = \frac{\sin \Omega(l-m)}{\pi(l-m)} \exp(-\alpha(l+m)); \qquad l, m = 0, 1, \ldots, N-1,$$

and the diagonal matrix $\mathbf{B}$, where $B_{lm} = \delta_{lm} \exp(-2\alpha l)$; $l, m = 0, 1, \ldots, N-1$. (The symbol $\delta_{ab}$ is the Kronecker delta function; $\delta_{ab} = 1$ if $a = b$, and 0 otherwise.) Then equation (2.2) can be written as

$$f(\mathbf{w}) = \frac{\mathbf{w} \cdot \mathbf{A} \cdot \mathbf{w}}{\mathbf{w} \cdot \mathbf{B} \cdot \mathbf{w}}. \tag{2.5}$$

To find the taper that optimizes the functional $f$, set the variation of $f$ with respect to $\mathbf{w}$ equal to zero

$$\delta f(\mathbf{w}; \mathbf{h}) = \frac{d}{d\epsilon} f(\mathbf{w} + \epsilon\mathbf{h}) \bigg|_{\epsilon=0} = 0$$

for all $N$-vectors $\mathbf{h}$ (Goldstein 1980, chapter 2; Smith 1974). Some algebra leads to the eigenvalue problem

$$\mathbf{A} \cdot \mathbf{w} - \lambda \mathbf{B} \cdot \mathbf{w} = 0, \tag{2.6}$$

where $\mathbf{A}$, $\mathbf{B}$ are $N \times N$ real symmetric matrices and the eigenvalue $\lambda = f(\mathbf{w})$.

The eigenvalue $\lambda$ is always less than unity, as can be seen from (2.2). The fractional spectral leakage of the signal at $\bar{\omega}$ outside the frequency band $(\bar{\omega} - \Omega, \bar{\omega} + \Omega)$ is $1 - \lambda$. The taper $\mathbf{w}_0 = [w_0(0), w_0(1), \ldots, w_0(N-1)]$ corresponding to the largest eigenvalue $\lambda_0$ is the optimal taper for minimizing spectral leakage. The taper $\mathbf{w}_0$ has roughly the same shape as other popular tapers such as the Hann and Blackman-Harris tapers. (The taper $\mathbf{w}_0$ corresponds to the solid curve labelled '0' in Figs 1 and 2.) The largest eigenvalue $\lambda_0$ is almost 1: one finds that $\lambda_0 = 1 - (2.9 \times 10^{-10})$ for $N\Omega = 8\pi$. Moreover, there are several eigenvalues in the descending family $\lambda_0 > \lambda_1 > \lambda_2 > \ldots > \lambda_{N-1}$ that are very close to $\lambda_0$ and hence close to unity. The associated eigenvectors $\mathbf{w}_0, \mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_{N-1}$ form a sequence of 'eigentapers', the first few of which possess good spectral leakage resistance.
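The eigenvalue problem (2.6) is small enough to solve directly. The sketch below is an illustration under stated assumptions rather than the procedure used for the figures: the function name `eigentapers`, the choice $N = 128$ and the use of `numpy.linalg.eigh` are ours. Because $\mathbf{B}$ is diagonal and positive, the generalized problem is reduced to an ordinary symmetric one for $\mathbf{u} = \mathbf{B}^{1/2}\mathbf{w}$.

```python
import numpy as np

def eigentapers(N, P, alpha, n_tapers=5):
    """Solve A.w = lambda B.w (2.6) for the n_tapers largest eigenvalues.

    A[l, m] = sin(Omega (l - m)) / (pi (l - m)) * exp(-alpha (l + m))
    B[l, m] = delta_lm * exp(-2 alpha l),  with Omega = 2 pi P / N.
    """
    Omega = 2.0 * np.pi * P / N
    t = np.arange(N)
    lag = t[:, None] - t[None, :]
    # np.sinc(x) = sin(pi x) / (pi x), which also handles the lag = 0 diagonal.
    S = (Omega / np.pi) * np.sinc(Omega * lag / np.pi)
    decay = np.exp(-alpha * t)
    A = decay[:, None] * S * decay[None, :]
    b = decay ** 2                               # diagonal of B
    C = A / np.sqrt(b[:, None] * b[None, :])     # B^(-1/2) A B^(-1/2)
    lam, U = np.linalg.eigh(C)                   # eigenvalues in ascending order
    order = np.argsort(lam)[::-1][:n_tapers]
    W = U[:, order] / np.sqrt(b)[:, None]        # columns are w_k, with w_k.B.w_k = 1
    return lam[order], W, A, b

N, P = 128, 4                                    # N * Omega = 8 pi
lam0, W0, A0, b0 = eigentapers(N, P, alpha=0.0)
beta = 1.0                                       # decay over the record, in Q-cycles
lam1, W1, A1, b1 = eigentapers(N, P, alpha=np.pi * beta / N)

print(1.0 - lam0)    # fractional leakage 1 - lambda_k of the five lowest-order tapers
```

Running the two cases side by side shows that the eigenvalues do not depend on the decay rate, only the taper shapes do.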

Figure 1. The five lowest-order eigentaper solutions to (2.6) when decay rate $\alpha = 0$ and $N\Omega = 8\pi$ (panel title: $4\pi$-prolate tapers). The solid black line is the optimal taper. Higher order tapers are successively more oscillatory.

Let the decay rate $\alpha = 0$ in (2.6), noting that $\mathbf{A}$ and $\mathbf{B}$ depend on $\alpha$. Then (2.6) becomes equation (2.9) of Thomson (1982): its solutions are optimal tapers for concentrating the energy of nondecaying sinusoids. As discussed by Slepian (1978) and Thomson (1982), the solutions to (2.6) when $\alpha = 0$ are the discrete $N \cdot W\pi$ prolate spheroidal sequences $\{v_t^{(k)}(N, W)\}_{t=0}^{N-1}$, where $W = \Omega/2\pi$ and $k$ is an integer. If $\alpha \ne 0$, the solutions to (2.6) are the eigentapers $w_k(t) = v_t^{(k)}(N, \Omega/2\pi) \exp(\alpha t)$; $t = 0, 1, \ldots, N-1$. A spectral estimate using these tapers is similar to the 'analytic continuation' of the DFT discussed in Buland & Gilbert (1978). In much of the following, the time-bandwidth product $P = N \cdot W = N\Omega/2\pi = 4$. (In Slepian 1978, 1983, the parameter $c = \pi \cdot P$ is used.) $P$ is usually taken to be an integer, but this convention is not required.
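The relation between the prolate sequences and the decaying-signal eigentapers follows in a few lines from the definitions of $\mathbf{A}$ and $\mathbf{B}$. In the sketch below, the symbols $\mathbf{E}$ (the diagonal matrix of decay factors) and $\mathbf{S}$ (the matrix $\mathbf{A}$ evaluated at $\alpha = 0$) are introduced only for this derivation.

```latex
\begin{aligned}
E_{lm} &= \delta_{lm}\exp(-\alpha l), \qquad
S_{lm} = \frac{\sin\Omega(l-m)}{\pi(l-m)}, \\
\mathbf{A} &= \mathbf{E}\,\mathbf{S}\,\mathbf{E}, \qquad \mathbf{B} = \mathbf{E}^{2}, \\
\mathbf{A}\cdot\mathbf{w} = \lambda\,\mathbf{B}\cdot\mathbf{w}
&\;\Longleftrightarrow\;
\mathbf{E}\,\mathbf{S}\,(\mathbf{E}\mathbf{w}) = \lambda\,\mathbf{E}\,(\mathbf{E}\mathbf{w})
\;\Longleftrightarrow\;
\mathbf{S}\cdot(\mathbf{E}\mathbf{w}) = \lambda\,(\mathbf{E}\mathbf{w}).
\end{aligned}
```

Hence $\mathbf{E}\mathbf{w}$ is an eigenvector of the $\alpha = 0$ problem, that is, a discrete prolate spheroidal sequence, so that $w_k(t) = v_t^{(k)} \exp(\alpha t)$ with the eigenvalue $\lambda_k$ unchanged.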

The $\{v_t^{(k)}(N, W)\}_{t=0}^{N-1}$ sequences have several properties that are shared with the decaying-signal eigentapers $\{w_k(t)\}_{t=0}^{N-1}$. For example, both possess an orthogonality property:

$$\sum_{t=0}^{N-1} v_t^{(k)}(N, W)\, v_t^{(k')}(N, W) = \sum_{t=0}^{N-1} \exp(-2\alpha t)\, w_k(t)\, w_{k'}(t) = \delta_{kk'}. \tag{2.7}$$

Figure 2. The five lowest-order eigentapers for a decaying sinusoid that decays by $\exp(-\pi)$ during the record (panel title: decaying sinusoid tapers, $\beta = 1.0$, $\Omega N = 8\pi$). Multiplying a decaying sinusoid by these tapers will concentrate its energy in a frequency band of width $2\Omega = 16\pi/N$. The taper amplitudes increase exponentially towards the end of the record to compensate for the signal's decay.

The tapers $\{w_k(t)\}_{t=0}^{N-1}$ sample that part of the signal that decays as $\exp(-\alpha t)$ in an orthogonal manner. Figure 1 shows the five lowest-order eigentapers $w_k(t) = v_t^{(k)}(N, W = 4/N)$; $t = 0, 1, \ldots, N-1$ for a stationary signal ($\alpha = 0$). The zeroth-order taper $\{w_0(t)\}_{t=0}^{N-1}$ is a $4\pi$ prolate taper. Note that the higher-order eigentapers are negative in some places and that they weight the data more heavily near the ends of the record. Figure 2 shows eigentapers for a signal that decays by $\exp(-\pi\beta)$, where $\beta = \alpha T/\pi = 1.0$ $Q$-cycles, during the record length $T = N\Delta t$. [One $Q$-cycle refers to the time required for $Q$ oscillations of the harmonic signal. This notation was introduced by Dahlen (1982). One $Q$-cycle is equivalent to an amplitude decay of $\exp(-\pi) = 1/23$.] Note the increasing amplitude towards the end of the record, as the tapers try to amplify the decaying signal. The tapers $\{w_k(t)\}_{t=0}^{N-1}$ produce the unwelcome result of amplifying the late record noise as well, so that while the signal power remains constant with time in the tapered record, the noise power increases exponentially. In the next subsection we will show how noise-cognizant eigentapers weight the later data more soberly.
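The arithmetic behind the $Q$-cycle convention and the noise amplification can be checked directly. This is a small illustrative sketch; the record length `N` and the helper name `decay_rate` are assumptions, and the sample spacing is taken as unity.

```python
import math

def decay_rate(beta, n_samples, dt=1.0):
    """Decay rate alpha for a record spanning beta Q-cycles (beta = alpha T / pi)."""
    T = n_samples * dt
    return math.pi * beta / T

N = 128          # hypothetical record length in samples
beta = 1.0       # one Q-cycle over the record, as in Fig. 2
alpha = decay_rate(beta, N)

signal_decay = math.exp(-alpha * N)      # signal amplitude ratio, end / start
taper_gain = math.exp(alpha * N)         # growth of the exp(alpha t) factor in w_k(t)
noise_power_gain = taper_gain ** 2       # stationary noise power scales as gain squared

print(f"signal amplitude decays to 1/{1.0 / signal_decay:.1f} of its initial value")
print(f"late-record noise power is amplified by a factor of about {noise_power_gain:.0f}")
```

For one $Q$-cycle the exponential factor in the taper raises the noise power at the end of the record by $\exp(2\pi)$ relative to the start, which is the motivation for the noise-cognizant tapers.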

Substituting $\{v_t^{(k)}(N, W)\exp(\alpha t)\}_{t=0}^{N-1}$ for $\{w(t)\}_{t=0}^{N-1}$ in (2.5) and using the definition of $\mathbf{A}$, one can show that the discrete prolate spheroidal sequences and the sequences $\{w_k(t)\}_{t=0}^{N-1}$ have the same eigenvalues $\lambda_k$ for any value of the decay rate $\alpha$. Therefore, the $k$th prolate taper and the $k$th decaying sinusoid eigentaper have the same fractional spectral leakage for a given value of $P = \Omega N/2\pi$. The $2NW$ lowest-order eigenvalues $\lambda_k$ of (2.6) are of order unity, and rapidly drop off thereafter (Slepian 1983). For example, $4\pi$-prolate sequences have eight order-unity eigenvalues, one per Rayleigh frequency spacing ($2\pi/N$) in the central region $(\bar{\omega} - 8\pi/N, \bar{\omega} + 8\pi/N)$. Values of $\lambda_k$ are given in Table 1 for some examples of $P\pi$ prolate tapers.

The amplitudes of the frequency transforms

$$W_k(\omega) = \sum_{t=0}^{N-1} w_k(t) \exp(-\pi\beta t/T) \exp(-i\omega t) \tag{2.8}$$

of the five lowest-order $4\pi$ prolate eigentapers are shown in Fig. 3 over a wide range of frequencies. (Here, record length $T = N$.) Substituting $\{v_t^{(k)}(N, W)\exp(\alpha t)\}_{t=0}^{N-1}$ for $\{w_k(t)\}_{t=0}^{N-1}$ in (2.8), one finds that the functions $W_k(\omega)$ are independent of decay rate. Figure 3 shows the excellent leakage rejection properties of the eigentapers. There is a sharp band-edge at frequency $\omega = 8\pi/T$. Note that sidelobe height increases as the order of the taper increases, but remains 30-40 dB below the height of the central region even for the fifth taper. Figure 4 is an expansion of the central peak region displaying both real and imaginary components of the same five eigentaper transforms $W_k(\omega)$. The plots of the central region show that each $W_k(\omega)$ samples the central band $(-\Omega, \Omega)$ in a different manner. The eigentaper transforms $W_k(\omega)$ become increasingly more oscillatory with increasing order. The $W_k(\omega)$ are orthogonal, both within the central band

$$(2\pi)^{-1} \int_{-\Omega}^{\Omega} d\omega\, W_k^*(\omega)\, W_{k'}(\omega) = \mathbf{w}_k \cdot \mathbf{A} \cdot \mathbf{w}_{k'} = \lambda_k\, \mathbf{w}_k \cdot \mathbf{B} \cdot \mathbf{w}_{k'} = \lambda_k \delta_{kk'} \tag{2.9}$$

using (2.6) through (2.8), and over the entire discrete Fourier transform frequency band

$$(2\pi)^{-1} \int_{-\pi}^{\pi} d\omega\, W_k^*(\omega)\, W_{k'}(\omega) = \delta_{kk'} \tag{2.10}$$

by (2.7). In (2.9) and (2.10), the asterisk denotes complex conjugation.

Table 1. Eigenvalues $\lambda_k$ for lowest-order $P\pi$ prolate tapers.

P   k   1 - λ_k              P   k   1 - λ_k
1   0   0.0189               3   0   1.348 × 10^-7
    1   0.2504                   1   9.245 × 10^-6
    2   0.7564                   2   3.850 × 10^-4
                                 3   5.086 × 10^-3
                                 4   5.386 × 10^-2
2   0   5.725 × 10^-5        4   0   2.946 × 10^-10
    1   2.438 × 10^-3            1   2.768 × 10^-8
    2   4.061 × 10^-2            2   1.210 × 10^-6
    3   0.2783                   3   4.245 × 10^-5
    4   0.7253                   4   5.899 × 10^-4
                                 5   7.496 × 10^-3

Figure 3. Frequency-transform amplitudes of the five lowest-order $4\pi$ prolate eigentapers (panel title: $4\pi$ prolate tapers; abscissa: frequency). The frequency transforms are independent of decay rate in (2.8). The sidelobes are lowest for the optimal eigentaper, and increase in height for higher order eigentapers. The abscissa is in units of $2\pi/T$, where $T$ is the record length. There is a sharp bandedge at frequency $\omega = 8\pi/T$.

Unfortunately, these tapers are only suitable for the analysis of noise-free records, but low-frequency seismic data are noisy. In the next section, tapers designed to analyse noisy records are discussed.


2B Decaying signal in white noise

Low-frequency seismic records can be modelled as a sum of decaying free oscillations immersed in noise

$$x(t) = \sum_m \mu_m \exp(i\omega_m t - \alpha_m t) + n(t), \qquad t \ge 0, \tag{2.11}$$

where, as before, $\omega_m$, $\alpha_m$ and $\mu_m$ are the frequency, decay rate and complex amplitude of the $m$th free oscillation, with onset at $t = 0$, and $n(t)$ is a realization of a noise process. The sum over $m$ extends in principle over the countably infinite elastic-gravitational free oscillations, but can be taken as finite in a record from a band-limited seismic instrument. We will assume throughout that $n(t)$ is a realization of a stationary, zero-mean, white noise process.

Figure 4. Expansion of central peak region of the five eigentaper frequency transforms of Fig. 3. The solid line is the real part; the dotted line is the imaginary part. The central peak region is increasingly more oscillatory for higher-order eigentapers.


In practice, the spectrum of seismic noise does not vary much over the frequency band of interest (Agnew & Berger 1978).

We determine optimal data tapers in this case using an extension of the variational formalism described above. In particular, we wish to balance the need to concentrate as much of the spectral energy of the signal as possible into a region of bandwidth $2\Omega$, against the desire to retain a high ratio of tapered signal power to tapered noise power. The exponential asymmetry of the decaying signal tapers $\{w_k(t)\}_{t=0}^{N-1}$ will increase the amplitude of stationary noise in the later part of the record. This will degrade the quality of the spectral estimate considerably unless the ratio of tapered signal to tapered noise is constrained to have a reasonable value.

Assume that in the interval $(\bar{\omega} - \Omega, \bar{\omega} + \Omega)$ the record $x(t)$ is composed of the signal, a single decaying sinusoid, plus white noise $n(t)$,

$$x(t) = \mu \exp(i\bar{\omega} t - \alpha t) + n(t). \tag{2.12}$$

Suppose also that we have discrete samples of $x(t)$,

$$\{x(t);\ t = 0, 1, 2, \ldots, N-1\},$$

so that the angular frequency $\omega \in (-\pi, \pi]$. We want to choose our taper $\{w(t)\}_{t=0}^{N-1}$ so that the energy of the tapered signal

$$\{w(t)\,\mu \exp(i\bar{\omega} t - \alpha t)\}_{t=0}^{N-1} \tag{2.13}$$

in $(\bar{\omega} - \Omega, \bar{\omega} + \Omega)$ relative to its total energy is maximized, but now with a constraint: the ratio of the tapered signal power to tapered noise power in $(\bar{\omega} - \Omega, \bar{\omega} + \Omega)$ has a fixed value. The discrete Fourier transform of the tapered noise is

$$m(\omega) = \sum_{t=0}^{N-1} n(t) \exp(-i\omega t)\, w(t). \tag{2.14}$$

A measure of the expected energy of the tapered noise at frequency $\omega$ is

$$\langle |m(\omega)|^2 \rangle = \sigma_n^2 \sum_{t=0}^{N-1} |w(t)|^2, \tag{2.15}$$

where $\langle\ \rangle$ denotes expectation value and $\sigma_n^2$ is the noise variance. The expected power of the noise in the tapered record in $(\bar{\omega} - \Omega, \bar{\omega} + \Omega)$ is

$$\int_{\bar{\omega}-\Omega}^{\bar{\omega}+\Omega} \langle |m(\omega)|^2 \rangle\, d\omega = 2\Omega\sigma_n^2 \sum_{t=0}^{N-1} |w(t)|^2. \tag{2.16}$$
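Equation (2.15) is easy to confirm by simulation. The following Monte Carlo sketch is illustrative only: the Gaussian noise, the Hann taper, the test frequency and the number of trials are all assumptions, and any real taper could be substituted.

```python
import numpy as np

rng = np.random.default_rng(0)      # fixed seed so the sketch is repeatable
N, sigma_n = 64, 1.5                # illustrative record length and noise level
w = np.hanning(N)                   # any real taper will do for this check
omega = 0.7                         # an arbitrary test frequency
t = np.arange(N)

n_trials = 20000
noise = sigma_n * rng.standard_normal((n_trials, N))       # white, zero-mean
m = (noise * w * np.exp(-1j * omega * t)).sum(axis=1)      # (2.14), one value per trial

estimated = np.mean(np.abs(m) ** 2)                        # sample average of |m|^2
predicted = sigma_n ** 2 * np.sum(w ** 2)                  # right-hand side of (2.15)
print(estimated, predicted)
```

Because the noise is white, the expected tapered noise energy does not depend on the test frequency, which is why (2.16) is simply $2\Omega$ times (2.15).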

We generalize (2.5) in order to constrain the ratio of tapered signal to tapered noise within the frequency band $(\bar{\omega} - \Omega, \bar{\omega} + \Omega)$. We now wish to maximize the functional

$$f(\mathbf{w}; \Omega, \eta) = \frac{\mathbf{w} \cdot \mathbf{A} \cdot \mathbf{w}}{\mathbf{w} \cdot \mathbf{B} \cdot \mathbf{w}} + \eta\, \frac{\mathbf{w} \cdot \mathbf{A} \cdot \mathbf{w}}{\mathbf{w} \cdot \mathbf{w}} \tag{2.17}$$

with respect to $\mathbf{w}$, where $\mathbf{w}$, $\mathbf{A}$ and $\mathbf{B}$ are as defined in Section 2A. The second term in equation (2.17) represents the ratio of tapered signal power to tapered noise power; $\eta$ is a Lagrange multiplier. In the limit of very large signal-to-noise ratio, i.e. as $(|\mu|^2/\sigma_n^2) \to \infty$, one expects $\eta$ to tend to zero. In principle $\eta$ is determined from the constraint equation; in practice we determine its value empirically. The condition $\delta f(\mathbf{w}; \Omega, \eta) = 0$ leads to a non-linear equation for the tapers $\mathbf{w}$ which maximize (2.17). This non-linear equation can be solved approximately (Lindberg 1986).

Alternatively, we can minimize the functional

$$J(\mathbf{w}; \Omega, \nu) = \frac{\mathbf{w} \cdot \mathbf{B} \cdot \mathbf{w}}{\mathbf{w} \cdot \mathbf{A} \cdot \mathbf{w}} + \nu\, \frac{\mathbf{w} \cdot \mathbf{w}}{\mathbf{w} \cdot \mathbf{A} \cdot \mathbf{w}} \tag{2.18}$$

(F. Gilbert, private communication). Solving $\delta J = 0$ leads to the equation

$$\mathbf{A} \cdot \mathbf{w} = \lambda' \mathbf{B}' \cdot \mathbf{w}, \tag{2.19}$$

where $\mathbf{B}' = \mathbf{B} + \nu\mathbf{I}$, $\mathbf{I}$ being the $N \times N$ identity matrix, and

$$\lambda' = [J(\mathbf{w}; \Omega, \nu)]^{-1}.$$

The eigenvectors which correspond to the largest eigenvalues $\lambda'$ of (2.19) will minimize $J$. Given the decay rate $\alpha$ and the noise-weighting parameter $\nu$, (2.19) can be solved for eigentapers $\{w_k(t; \beta, \nu)\}_{t=0}^{N-1}$. When $\nu = 0$, the elements of the $k$th taper $w_k(t; \beta, 0) = w_k(t)$; $t = 0, 1, 2, \ldots, N-1$, and the tapers reduce to those of Section 2A. The fraction of tapered signal power that remains in the frequency band $(\bar{\omega} - \Omega, \bar{\omega} + \Omega)$ is

$$\frac{\mathbf{w}_k \cdot \mathbf{A} \cdot \mathbf{w}_k}{\mathbf{w}_k \cdot \mathbf{B} \cdot \mathbf{w}_k} = \lambda_k, \tag{2.20}$$

which can be calculated from the eigenvectors and eigenvalues of (2.19). We have found it helpful to think of the $\lambda_k$ as 'bandwidth retention factors'.
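A compact numerical version of (2.19) and (2.20) may help fix ideas. The sketch below is a hedged illustration, not the EISPACK-based procedure of the paper: the function name `noise_cognizant_tapers`, the use of `numpy.linalg.eigh` and the parameter values are assumptions, and the output is not claimed to reproduce the tabulated values.

```python
import numpy as np

def noise_cognizant_tapers(N, P, beta, nu, n_tapers=5):
    """Solve A.w = lambda' (B + nu I).w and return lambda'_k, lambda_k and the tapers."""
    Omega = 2.0 * np.pi * P / N
    alpha = np.pi * beta / N                  # record length T = N at unit sampling
    t = np.arange(N)
    lag = t[:, None] - t[None, :]
    decay = np.exp(-alpha * t)
    S = (Omega / np.pi) * np.sinc(Omega * lag / np.pi)
    A = decay[:, None] * S * decay[None, :]
    b = decay ** 2                            # diagonal of B
    b_prime = b + nu                          # diagonal of B' = B + nu I
    # B' is diagonal and positive: reduce to an ordinary symmetric eigenproblem.
    C = A / np.sqrt(b_prime[:, None] * b_prime[None, :])
    lam_prime, U = np.linalg.eigh(C)
    order = np.argsort(lam_prime)[::-1][:n_tapers]
    W = U[:, order] / np.sqrt(b_prime)[:, None]      # w_k.B'.w_k' = delta_kk'
    # Bandwidth retention factors (2.20): w_k.A.w_k / w_k.B.w_k
    num = np.einsum('tk,ts,sk->k', W, A, W)
    den = np.einsum('tk,t,tk->k', W, b, W)
    return lam_prime[order], num / den, W

lam_prime, retention, W = noise_cognizant_tapers(128, 4, beta=0.6, nu=0.01)
print(lam_prime)
print(retention)
```

Since $\mathbf{w}\cdot\mathbf{B}'\cdot\mathbf{w} \ge \mathbf{w}\cdot\mathbf{B}\cdot\mathbf{w}$ for $\nu \ge 0$, each eigenvalue $\lambda'_k$ is bounded above by the corresponding retention factor $\lambda_k$, and the two coincide when $\nu = 0$.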

We used EISPACK subroutines (Smith et al. 1976) to solve (2.19) for its largest eigenvalues $\lambda'_k$ and associated eigenvectors. We normalized the tapers $\{w_k(t; \beta, \nu)\}_{t=0}^{N-1}$ so that

$$\mathbf{w}_k \cdot \mathbf{B}' \cdot \mathbf{w}_{k'} = \sum_{t=0}^{N-1} (\exp(-2\alpha t) + \nu)\, w_k(t; \beta, \nu)\, w_{k'}(t; \beta, \nu) = \delta_{kk'}. \tag{2.21}$$

Rather than solve an eigenvalue problem for every data series length, (2.19) was solved for $N = 128$ and the tapers for other values of $N$ were found using spline interpolation. This approach takes advantage of the asymptotic relations between the discrete and continuous-time tapers described in Slepian (1978). Tests using these interpolated tapers showed negligible degradation of spectral leakage properties relative to exact solutions for $N > 128$. For $N < 128$, (2.19) should be solved directly (A. Chave, private communication), but such short time series are rare in free oscillation work. The taper transforms are computed from the interpolated tapers using an FFT after padding the tapers with zeroes until their lengths are a power of two.
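The resampling and zero-padding steps can be sketched as follows. This is an assumption-laden illustration: linear interpolation (`numpy.interp`) stands in for the spline interpolation described in the text, a Hann window is used as a placeholder for a taper computed at $N = 128$, and the unit-energy normalization is a simplification of (2.21). The helper names are hypothetical.

```python
import numpy as np

def resample_taper(w_ref, n_new):
    """Stretch a reference taper to n_new samples.

    Linear interpolation is used here as a stand-in for spline interpolation.
    """
    n_ref = len(w_ref)
    s_ref = np.arange(n_ref) / (n_ref - 1)       # fractional position in the record
    s_new = np.arange(n_new) / (n_new - 1)
    w_new = np.interp(s_new, s_ref, w_ref)
    return w_new / np.sqrt(np.sum(w_new ** 2))   # unit energy (a simplifying assumption)

def padded_transform(w):
    """FFT of a taper after zero-padding to the next power of two."""
    n_fft = 1 << (len(w) - 1).bit_length()
    return np.fft.fft(w, n=n_fft)

w_ref = np.hanning(128)                  # placeholder for a taper solved at N = 128
w_long = resample_taper(w_ref, 1000)     # taper for a hypothetical 1000-point record
W_long = padded_transform(w_long)
print(len(w_long), len(W_long))
```

Padding to a power of two only refines the frequency grid on which the taper transform is sampled; it does not change the transform itself.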

The preceding argument shows that $\nu$ is a complicated function of the signal-to-noise ratio. For large signal-to-noise ratios $|\mu|^2/\sigma_n^2$, $\nu$ will be very small, $\mathbf{B}' \approx \mathbf{B}$, and the solution of (2.19) is not very different from the solution of (2.6). For smaller signal-to-noise ratios, one expects that the optimal tapers will have a $\nu$ of finite size. One could pick an incorrect value of $\nu$ for a particular signal-to-noise ratio, but then the tapers would not perform optimally. Useful values are best determined by experiment. We will show in the appendix that using eigentapers having larger values of $\nu$ results in a marked improvement in the detection capability of the multiple-taper algorithm.

Some examples of noise-cognizant eigentapers are exhibited in Fig. 5 for the case $\nu = 0.01$, $\beta = 0.6$, and $\Omega N = 8\pi$. Note the strong asymmetry of the tapers, with a strong emphasis on data in the earlier section of the record where instantaneous signal-to-noise ratio is greater. The height of the taper's main peak increases with increasing order to compensate for the decay of the signal, as shown in Fig. 2. Figure 6 shows tapers which were designed with $\nu = 0.1$, $\beta = 0.6$, and $\Omega N = 8\pi$. The preference for the early part of the record is more drastic, resulting in significant weighting at the onset of the time series. Here the variational principle minimizing (2.18) has sacrificed resistance to spectral leakage in order to raise the ratio of tapered signal to tapered noise. Figure 7 displays eigentapers designed with $\nu = 0.001$, $\beta = 0.2$, and $\Omega N = 8\pi$. These eigentapers are for a series containing sinusoids that only decay slightly in a more favourable signal-to-noise environment. Asymmetrical weighting remains evident. Table 2 shows values of $\lambda_k$ and $\lambda'_k$ for tapers $\{w_k(t; \beta, \nu)\}_{t=0}^{N-1}$ for a selection of $\beta$ and $\nu$ values. The eigenvalues $\lambda'_k$ drop rapidly from unity with increasing $k$. The bandwidth retention factors $\lambda_k$ remain relatively constant among eigentapers of fixed $\beta$ and $\nu$. This behaviour can be observed qualitatively in the Fig. 8 plots of the amplitudes of the frequency transforms $W_k(\omega; \beta, \nu)$ of the tapers $\{w_k(t; \beta, \nu)\}_{t=0}^{N-1}$,

$$W_k(\omega; \beta, \nu) = \sum_{t=0}^{N-1} w_k(t; \beta, \nu) \exp[(-\pi\beta/T)\,t] \exp(-i\omega t), \tag{2.22}$$

for $\beta = 0.6$, $\nu = 0.01$, and $\Omega N = 8\pi$. The five lowest order eigentapers have sidelobes of comparable height. Enlargements of the central peak regions are shown in Fig. 9.

Table 2. Eigenvalues $\lambda'_k$ and bandwidth retention factors $\lambda_k$ for lowest order noise-cognizant optimal tapers.

     ν = 0.01, β = 0.6     ν = 0.1, β = 0.6     ν = 0.001, β = 0.2    ν = 0.01, β = 1
k    λ'_k       λ_k        λ'_k      λ_k        λ'_k      λ_k         λ'_k      λ_k
0    0.962301   0.99869    0.73574   0.98905    0.99729   0.99997     0.94592   0.99682
1    0.940787   0.99760    0.63985   0.98003    0.99676   0.99995     0.89435   0.99227
2    0.910363   0.99619    0.53070   0.96888    0.99614   0.99993     0.80953   0.98428
3    0.867487   0.99361    0.41590   0.95026    0.99542   0.99990     0.68406   0.96699
4    0.808444   0.99006    0.30618   0.92719    0.99458   0.99975     0.52440   0.93826
5    0.732283   0.98339    0.21152   0.89053    0.98934   0.99469     0.35619   0.88398

Values of the eigenvalues $\lambda'_k$ and the bandwidth retention factors $\lambda_k$ for time-bandwidth product $\Omega N = 8\pi$ and various values of the decay parameter $\beta$ and the noise parameter $\nu$. Note that the bandwidth retention parameters $\lambda_k$ are close to 1 for small $k$, and are successively smaller for higher order tapers. The lowest order eigentapers have the smallest fractional leakage $1 - \lambda_k$; higher order eigentapers suffer from successively greater spectral leakage.

Because $\mathbf{A}$ and $\mathbf{B}'$ are symmetric, the orthogonality condition (2.9) remains valid for noise-cognizant tapers, using (2.21). However, as the noise-cognizant tapers $\{w_k(t; \beta, \nu)\}_{t=0}^{N-1}$ satisfy (2.21) and not (2.7), the frequency-domain orthogonality relation (2.10) does not hold. In its place we have

$$(2\pi)^{-1} \int_{-\pi}^{\pi} d\omega\, W_k^*(\omega; \beta, \nu)\, W_{k'}(\omega; \beta, \nu) = \sum_{t=0}^{N-1} \exp[(-2\pi\beta/T)\,t]\, w_k(t; \beta, \nu)\, w_{k'}(t; \beta, \nu) = D_{kk'} \tag{2.23}$$

for $k, k' = 0, 1, \ldots, K-1$. The matrix $\mathbf{D}$ is diagonally dominant for small $\nu$. Table 3 lists the elements of $\mathbf{D}$ for the five lowest order eigentapers with $\Omega N = 8\pi$, $\beta = 0.6$ and $\nu = 0.01$ or $\nu = 0.1$. The magnitude of the off-diagonal elements of $\mathbf{D}$ indicates the departure from orthogonality of the frequency transforms $W_k(\omega; \beta, \nu)$ over $(-\pi, \pi]$.

Table 3. Elements of matrix $\mathbf{D}$ for $\beta = 0.6$, $\Omega N = 8\pi$ and $\nu = 0.01$.

        k' = 0      1           2           3           4
k = 0   0.98151     0.00183     0.00218    -0.00235     0.00227
    1   0.00183     0.97130    -0.00352     0.00415    -0.00445
    2   0.00218    -0.00352     0.95658    -0.00620     0.00719
    3  -0.00235     0.00115    -0.00620     0.93560    -0.01043
    4   0.00227    -0.00145     0.00719    -0.01043     0.90638

for $\beta = 0.6$, $\Omega N = 8\pi$ and $\nu = 0.1$.

        k' = 0      1           2           3           4
k = 0   0.86278     0.01271    -0.01358     0.01261     0.00995
    1  -0.01271     0.80917     0.02210     0.02322     0.02143
    2  -0.01358     0.02210     0.74322     0.03364     0.03440
    3   0.01261     0.02322     0.03364     0.66634     0.04693
    4   0.00995     0.02143     0.03440     0.04693     0.58231

Figure 9. Expansion of the central peak region of the frequency transform amplitudes of the five lowest order eigentapers of Fig. 8. The solid line is the real part of the frequency transform; the dashed line is the imaginary part of the transform.

We have required that our data tapers possess certain desirable properties. We want them to have the ability to concentrate most of a decaying sinusoid's energy into a given frequency band, balanced against the capacity to maintain a high signal-to-noise ratio for the tapered data in the frequency domain. This leads to a variational calculus problem, whose solutions are a family of data tapers. These tapers provide a method of orthogonally sampling a decaying sinusoid, in both the time and frequency domains. By sampling a decaying sinusoid repeatedly in different ways, one can obtain superior estimates of its frequency and amplitude. Simple techniques to do this, based on those outlined by Thomson (1982), are the subject of the next section.


3  Harmonic  analysis 

An important part of long-period seismic data analysis is the detection of decaying sinusoids in the data and the measurement of their frequencies and amplitudes. The estimation of decay rate $\alpha$ is also important; we plan to address this problem in later work. In the following it is assumed that the $Q$ of the decaying oscillation is known or has been approximated by some method (e.g. Riedesel et al. 1980).

The spectra of low-frequency seismic time-series consist of harmonic 'lines' which have been broadened by decay, and a continuous background spectrum. The decay-broadened 'lines' are treated as signal, whereas the continuous spectrum is considered to be noise. This sets free oscillation data analysis apart from many familiar problems in seismic spectral estimation, e.g. finding the frequency content of body waves, or earthquakes in the near field. The spectra in those cases are predominantly continuous. There are methods of multitaper spectrum analysis that are useful for spectra which do not have harmonic line components (e.g. Thomson 1982; Park, Lindberg & Vernon, in press; Lindberg, Vernon & Park, unpublished manuscript).

The most straightforward method of detecting line components in low-frequency data is to measure obvious spectral peaks in a discrete Fourier transform of the data. If one tapers the time series in a prudent fashion, as indicated in Dahlen (1982), this approach is adequate for well-excited oscillations generated by large earthquakes ($M_s > 7$). Unfortunately, most of these well-excited oscillations are surface-wave-equivalent fundamental modes which by themselves allow poor depth resolution. The modes most useful for enhancing the resolution at depth (e.g. the overtone oscillations that correspond to PKP, PKIKP, SKS etc. motion) are excited only by very large or very deep earthquake sources. Even then, their spectral peaks may not protrude substantially above the background noise. Masters & Gilbert (1981) show a typical example of this problem in the presumed identification of two inner-core oscillations. The use of spherical harmonic stacking of records from a global array (Gilbert & Backus 1965; Gilbert & Dziewonski 1975; Buland, Berger & Gilbert 1979) can aid mode identification greatly, especially in the case of closely spaced spectral lines caused by splitting of a free oscillation into individual singlets. However, it is difficult to identify decaying sinusoids in a low signal-to-noise environment using conventional methods of spectral estimation.

In the following we propose a method that is designed to yield a quantitative measure of the certainty that there is a decaying sinusoid at any given frequency. The novelty of our algorithm resides in the additional information obtained by sampling the data with more than one taper and the introduction of a statistical theory based on an $F$-test to detect harmonic spectral components and reject continuous, random-phase noise.

Suppose that $x(t)$ is a record consisting of noise and a number of decaying sinusoids, one of which has frequency $\bar{\omega}$. Then one can write

$$x(t) = \mu \exp(i\bar{\omega} t - \alpha t) + e(t), \tag{3.1}$$

where $\mu$ is a complex amplitude, $\alpha$ is a decay rate, and $e(t)$ is an error term. The error term consists of other decaying sinusoids and noise. For a sufficiently small value of $\Omega$, $x(t)$ contains only the single decaying sinusoid $\mu \exp(i\bar{\omega} t - \alpha t)$ in the frequency interval $(\bar{\omega} - \Omega, \bar{\omega} + \Omega)$. This decaying sinusoid represents a deterministic signal in the record $x(t)$, and one can use the method of least squares to estimate its amplitude $\mu$.

We assume in the following that there is not more than one decaying sinusoid in the frequency interval $(\bar{\omega} - \Omega, \bar{\omega} + \Omega)$. This is often not true in practice, but in many applications the various singlets of free oscillation multiplets are observed to combine into a signal that is well approximated by a single resonance. Also, the least squares procedure can be generalized to the case of two or more decaying sinusoids in a frequency interval of width $2\Omega$ (Thomson 1982).

As we have indicated, it is important to taper the data. Using the optimal tapers of
Section 2, we multiply the data by each taper $\{w_k(t;\beta,\nu)\}_{t=0}^{N-1}$,
$k = 0, 1, 2, \ldots, K-1$, in turn. We pick only a small number ($K$) of data tapers because
higher order tapers have successively poorer leakage resistance. In the case $\nu = 0$ the
tapers up to order $K \approx N\Omega/\pi$ have good spectral leakage resistance; higher-order
tapers exhibit vastly poorer performance (Slepian 1978). This is evident from the behaviour
of the eigenvalues $\lambda_k$ appearing in Table 1. In the case $\nu \neq 0$, we choose the $K$
noise-cognizant tapers with the largest bandwidth retention factors. Usually
$K < N\Omega/\pi$ in this case. We show how we choose $K$ in Section 4.

Multiplying the data $\{x(t)\}_{t=0}^{N-1}$ by the $K$ eigentapers
$\{w_k(t;\beta,\nu)\}_{t=0}^{N-1}$ one obtains $K$ time series:

$$\{w_k(t;\beta,\nu)\,x(t)\}_{t=0}^{N-1}; \quad k = 0, 1, \ldots, K-1.$$

From equation (3.1),

$$e(t)\,w_k(t;\beta,\nu) = x(t)\,w_k(t;\beta,\nu) - \mu\,w_k(t;\beta,\nu)\exp(i\hat\omega t - \alpha t), \quad t = 0, 1, \ldots, N-1. \tag{3.2}$$

Take the discrete Fourier transform of both sides of (3.2):

$$e_k(\omega) = y_k(\omega) - \mu\,W_k(\omega - \hat\omega;\beta,\nu), \tag{3.3}$$

where

$$e_k(\omega) = \sum_{t=0}^{N-1} e(t)\,w_k(t;\beta,\nu)\exp(-i\omega t),$$

$$y_k(\omega) = \sum_{t=0}^{N-1} x(t)\,w_k(t;\beta,\nu)\exp(-i\omega t),$$

and $W_k(\omega;\beta,\nu)$ is defined in equation (2.22). Because of the leakage resistance of
the tapers, the $e_k(\omega)$ are approximately the complex eigenspectra of the noise in
$(\hat\omega - \Omega, \hat\omega + \Omega)$.
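
The tapered transforms defined above can be evaluated directly from their definition. The following is a minimal sketch, not the authors' code: the function name `eigenspectra` is hypothetical, the tapers are passed in as plain lists (the optimal tapers of Section 2 are not constructed here), and the usage example applies a single boxcar taper to an undamped sinusoid purely for illustration.

```python
import cmath
import math


def eigenspectra(x, tapers, omega):
    """Return [y_k(omega)] where y_k = sum_t x[t] * w_k[t] * exp(-i * omega * t).

    x      : sequence of N real or complex samples
    tapers : list of K sequences, each of length N (the w_k)
    omega  : angular frequency in radians per sample
    """
    spectra = []
    for w in tapers:
        total = 0j
        for t, (xt, wt) in enumerate(zip(x, w)):
            total += xt * wt * cmath.exp(-1j * omega * t)
        spectra.append(total)
    return spectra


# Usage: an undamped complex sinusoid seen through one boxcar taper (assumed
# here only as a stand-in for a real eigentaper).
N = 8
omega0 = 2 * math.pi * 2 / N
record = [cmath.exp(1j * omega0 * t) for t in range(N)]
y = eigenspectra(record, [[1.0] * N], omega0)
```

In practice one would evaluate all bin frequencies at once with an FFT of each tapered series; the explicit sum is kept here so that the correspondence with the definition of $y_k(\omega)$ is visible.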

We would like to make an estimate $\hat\mu$ of the amplitude $\mu$ of a decaying sinusoid of
frequency $\hat\omega$. To do this, a least-squares procedure is performed. At each frequency
$\omega$, the complex eigenspectra $y_k(\omega)$, $k = 0, 1, \ldots, K-1$, are taken to be the
dependent variables, $\mu$ is the parameter to be estimated, and the
$W_k(\omega - \hat\omega;\beta,\nu)$, $k = 0, 1, \ldots, K-1$, are the independent variables. By
the Gauss-Markov theorem, to produce a minimum variance estimate of $\mu$ that is unbiased at
the decaying sinusoid's true frequency using least squares, the random variables
$y_k(\omega)$ must be statistically uncorrelated (Bickel & Doksum 1977, ch. 7; Luenberger
1969, ch. 4; Tukey 1975). However, the $y_k(\omega)$ are not necessarily uncorrelated random
variables:

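$$\mathrm{Cov}\,(y_k(\omega), y_{k'}(\omega)) = \langle y_k(\omega)\,y_{k'}^*(\omega)\rangle - \langle y_k(\omega)\rangle\langle y_{k'}(\omega)\rangle^* = \sigma_n^2 \sum_{t=0}^{N-1} w_k(t;\beta,\nu)\,w_{k'}(t;\beta,\nu). \tag{3.4}$$

The sum

$$H_{kk'} = \sum_{t=0}^{N-1} w_k(t;\beta,\nu)\,w_{k'}(t;\beta,\nu)$$

will not vanish unless $\beta = \nu = 0$ and $k \neq k'$. For $\beta = \nu = 0$,
$H_{kk'} = \delta_{kk'}$. Elements of the matrix $H$ for $\beta = 0.6$, $\Omega N = 8\pi$, and
$\nu = 0.01$ and $\nu = 0.1$ are shown in Table 4.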

Since $H$ is symmetric and positive definite, it has a Cholesky decomposition. That is,
there exists a lower triangular matrix $G$ with positive diagonal entries such that

$$H = GG^T \tag{3.5}$$

where the superscript $T$ denotes matrix transpose (Golub & Van Loan 1983).
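
As an illustration of this factorization, the sketch below computes a Cholesky factor for the $\nu = 0.01$ matrix of Table 4. It is a plain textbook Cholesky routine, not the authors' implementation; the function name `cholesky` is hypothetical, and the matrix is entered in symmetrised form on the assumption that the printed upper and lower triangles agree.

```python
import math

# H for beta = 0.6, Omega*N = 8*pi, nu = 0.01 (Table 4), entered as a
# symmetric matrix (assumption: lower triangle mirrors the upper triangle).
H = [
    [1.84870, 0.18322, -0.21788, 0.23523, -0.22668],
    [0.18322, 2.86999, 0.35164, -0.41490, 0.44470],
    [-0.21788, 0.35164, 4.34203, 0.61999, -0.71893],
    [0.23523, -0.41490, 0.61999, 6.44037, 1.04331],
    [-0.22668, 0.44470, -0.71893, 1.04331, 9.36201],
]


def cholesky(a):
    """Return lower-triangular g with g * g^T == a (a symmetric positive definite)."""
    n = len(a)
    g = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(g[i][m] * g[j][m] for m in range(j))
            if i == j:
                g[i][i] = math.sqrt(a[i][i] - partial)
            else:
                g[i][j] = (a[i][j] - partial) / g[j][j]
    return g


G = cholesky(H)
```

Because $G$ is triangular, applying $G^{-1}$ to a vector needs only forward substitution, so the inverse never has to be formed explicitly.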

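Transform the complex eigenspectra $y_k(\omega)$ and the independent variables
$W_k(\omega;\beta,\nu)$ using the matrix $G^{-1}$ as follows:

$$\begin{aligned}
v_k(t;\beta,\nu) &= (G^{-1})_{kk'}\,w_{k'}(t;\beta,\nu) \\
z_k(\omega) &= (G^{-1})_{kk'}\,y_{k'}(\omega) \\
V_k(\omega;\beta,\nu) &= (G^{-1})_{kk'}\,W_{k'}(\omega;\beta,\nu) \\
g_k(\omega) &= (G^{-1})_{kk'}\,e_{k'}(\omega),
\end{aligned} \tag{3.6}$$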


Table 4. Elements of matrix $H$ for $\beta = 0.6$, $\Omega N = 8\pi$ and $\nu = 0.01$.

  k \ k'       0          1          2          3          4
    0       1.84870    0.18322   -0.21788    0.23523   -0.22668
    1       0.18322    2.86999    0.35164   -0.41490    0.44470
    2      -0.21788    0.35164    4.34203    0.61999   -0.71893
    3       0.23523   -0.41490    0.61999    6.44037    1.04331
    4      -0.22668    0.44470   -0.71893    1.04331    9.36201

Elements of matrix $H$ for $\beta = 0.6$, $\Omega N = 8\pi$ and $\nu = 0.1$.

  k \ k'       0          1          2          3          4
    0       1.37223    0.12714    0.13583   -0.12606   -0.09953
    1       0.12714    1.90834   -0.22102    0.23223    0.21434
    2       0.13583   -0.22102    2.56776    0.33639    0.34401
    3      -0.12606    0.23223    0.33639    3.33664   -0.46931
    4      -0.09953    0.21434    0.34401   -0.46931    4.17692

where $v_k(t;\beta,\nu)$, $z_k(\omega)$, $V_k(\omega;\beta,\nu)$ and $g_k(\omega)$ are the
transformed tapers, the transformed complex eigenspectra, the transformed independent
variables, and the transformed errors respectively. We employ the Einstein summation
convention in (3.6) and hereafter, summing repeated indices over the range
$0, 1, \ldots, K-1$. From (3.3),

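$$g_k(\omega) = z_k(\omega) - \mu\,V_k(\omega - \hat\omega;\beta,\nu). \tag{3.7}$$

The transformed complex eigenspectra $z_k(\omega)$ are uncorrelated, as

$$\begin{aligned}
\mathrm{Cov}\,(z_k(\omega), z_{k'}(\omega)) &= \langle z_k(\omega)\,z_{k'}^*(\omega)\rangle - \langle z_k(\omega)\rangle\langle z_{k'}(\omega)\rangle^* \\
&= (G^{-1})_{kl}\,\mathrm{Cov}\,[y_l(\omega), y_m(\omega)]\,(G^{-1})_{k'm} \\
&= \sigma_n^2\,\delta_{kk'}
\end{aligned} \tag{3.8}$$

by (3.4) and (3.5).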
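
The decorrelation follows in one line once the covariance is written in matrix form. As a sketch, assuming $G$ is real and writing $\mathbf{y}$ and $\mathbf{z}$ for the vectors of eigenspectra and transformed eigenspectra (vector notation introduced here only for illustration):

$$\mathrm{Cov}(\mathbf{z}) = G^{-1}\,\mathrm{Cov}(\mathbf{y})\,G^{-T} = \sigma_n^2\,G^{-1} H\,G^{-T} = \sigma_n^2\,G^{-1}(GG^T)\,G^{-T} = \sigma_n^2\,I.$$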

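A measure of the error in assuming that the record $x(t)$ consists of a single decaying
sinusoid of frequency $\hat\omega$ is

$$M(\omega) = \sum_{k=0}^{K-1} |g_k(\omega)|^2 = \sum_{k=0}^{K-1} |z_k(\omega) - \mu\,V_k(\omega - \hat\omega;\beta,\nu)|^2. \tag{3.9}$$

Perform a least squares procedure; solve

$$\frac{\partial M(\omega)}{\partial \mu^*} = 0 \tag{3.10}$$

for $\mu$. Then (3.10) becomes

$$0 = \sum_{k=0}^{K-1} V_k^*(\omega - \hat\omega;\beta,\nu)\,[z_k(\omega) - \mu\,V_k(\omega - \hat\omega;\beta,\nu)]. \tag{3.11}$$

Note that $\hat\mu$ is actually a function of the frequency $\hat\omega$:

$$\hat\mu = \hat\mu(\hat\omega) = \frac{\sum_{k=0}^{K-1} V_k^*(\omega - \hat\omega;\beta,\nu)\,z_k(\omega)}{\sum_{k=0}^{K-1} |V_k(\omega - \hat\omega;\beta,\nu)|^2}. \tag{3.12}$$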

One can determine $z_k(\omega)$ at a set of discrete frequencies $\omega_j$;
$j = 0, 1, 2, \ldots, J-1$, called bin frequencies, by applying an FFT to the tapered data. The
data can be padded with zeroes to interpolate the spectrum. (Note that this 'interpolation'
adds no extra information.) To estimate the amplitude $\mu$ of the proposed signal at each
discrete bin frequency, set $\omega = \hat\omega = \omega_j$; $j = 0, 1, \ldots, J-1$ in (3.12).
Then

$$\hat\mu(\omega_j) = \frac{\sum_{k=0}^{K-1} V_k^*(0;\beta,\nu)\,z_k(\omega_j)}{\sum_{k=0}^{K-1} |V_k(0;\beta,\nu)|^2}. \tag{3.13}$$

Substituting for $z_k(\omega)$ in (3.13), it is seen that this 'pointwise regression' formula
for $\hat\mu$ is equivalent to a Fourier transform of the time series $\{x(t)\}_{t=0}^{N-1}$
with a hybrid taper $\{w(t;\beta,\nu)\}_{t=0}^{N-1}$ given by the formula

$$w(t;\beta,\nu) = \frac{\sum_{k=0}^{K-1} V_k(0;\beta,\nu)\,v_k(t;\beta,\nu)}{\sum_{k=0}^{K-1} |V_k(0;\beta,\nu)|^2}; \quad t = 0, 1, \ldots, N-1. \tag{3.14}$$

[$V_k^*(0;\beta,\nu) = V_k(0;\beta,\nu)$ since $\{w_k(t;\beta,\nu)\}_{t=0}^{N-1}$ is a
real-valued sequence.] Note that $\{w(t;\beta,\nu)\}_{t=0}^{N-1}$ is not optimal in the sense
of (2.19).
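
The pointwise regression (3.13) is a one-parameter complex least-squares fit and is short to write down. The sketch below is illustrative only: the function name `amplitude_estimate` and the numerical values of $V_k(0;\beta,\nu)$ are made up, and the eigenspectra are taken to be noise-free so that the estimate can be compared with the amplitude that generated them.

```python
def amplitude_estimate(z, v0):
    """Pointwise regression (3.13): mu_hat = sum_k conj(V_k) z_k / sum_k |V_k|^2.

    z  : transformed complex eigenspectra z_k at one bin frequency
    v0 : transformed taper transforms V_k(0; beta, nu)
    """
    numerator = sum(complex(v).conjugate() * zk for v, zk in zip(v0, z))
    denominator = sum(abs(v) ** 2 for v in v0)
    return numerator / denominator


# Usage with hypothetical numbers: noise-free eigenspectra z_k = mu * V_k(0).
v0 = [1.0, 0.5, -0.25]
mu_true = 2.0 - 1.0j
z = [mu_true * v for v in v0]
mu_hat = amplitude_estimate(z, v0)
```

With noise present the same function returns the least-squares estimate rather than the exact amplitude, which is why a separate test of fit is needed.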

In terms of the complex eigenspectra and taper frequency transforms:

$$\hat\mu(\omega_j) = \frac{\sum_{m=0}^{K-1}\sum_{l=0}^{K-1} W_m^*(0;\beta,\nu)\,(H^{-1})_{ml}\,y_l(\omega_j)}{\sum_{m=0}^{K-1}\sum_{l=0}^{K-1} W_m^*(0;\beta,\nu)\,(H^{-1})_{ml}\,W_l(0;\beta,\nu)}. \tag{3.15}$$

When $\beta = \nu = 0$, $H = I$, and (3.15) reduces to equation (13.5) of Thomson (1982).
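
The step from (3.13) to (3.15) uses only the definitions in (3.6) and the factorization (3.5). A sketch, with $\mathbf{W}$ and $\mathbf{y}$ denoting the vectors of $W_k(0;\beta,\nu)$ and $y_k(\omega_j)$ (notation assumed here for brevity) and $G$ real:

$$\sum_k V_k^* z_k = (G^{-1}\mathbf{W})^{\dagger}(G^{-1}\mathbf{y}) = \mathbf{W}^{\dagger} G^{-T} G^{-1}\,\mathbf{y} = \mathbf{W}^{\dagger} H^{-1}\,\mathbf{y}, \qquad \sum_k |V_k|^2 = \mathbf{W}^{\dagger} H^{-1}\,\mathbf{W}.$$

With $H = I$ both expressions reduce to unweighted sums over the tapers.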

If $\nu = 0$ (i.e. tapers designed without provision for stationary background noise) or
$\beta = 0$ (tapers designed for non-decaying signals),
$W_k(0;\beta,0) = W_k(0;0,\nu) = 0$ for odd $k$, since in both cases the $W_k$ reduce to the
transforms of discrete prolate spheroidal sequences. In these instances the pointwise
regression technique ignores the odd order tapers completely in constructing $\hat\mu$.

By (3.8),

$$\mathrm{Var}\,[\hat\mu(\omega_j)] = \frac{\sum_{k=0}^{K-1} |V_k(0;\beta,\nu)|^2\,\mathrm{Var}\,[z_k(\omega_j)]}{\left(\sum_{k=0}^{K-1} |V_k(0;\beta,\nu)|^2\right)^2} = \frac{\sigma_n^2}{\sum_{m=0}^{K-1}\sum_{l=0}^{K-1} W_m^*(0;\beta,\nu)\,(H^{-1})_{ml}\,W_l(0;\beta,\nu)}. \tag{3.16}$$

The variance of the estimated amplitude increases with increasing noise amplitude.

If there is no decaying sinusoid at frequency $\omega_j$, one would expect $\hat\mu$ to be
small. However, this is not the best criterion for deciding if there is a decaying sinusoid
at frequency $\omega_j$. The sinusoid may be present, but it may have a very small amplitude.
Also, the least squares procedure may yield a large value for $\hat\mu$ at some frequency,
but a decaying sinusoid may not be a good way to characterize the data at that frequency. A
method of evaluating the fit of our decaying sinusoid model to the data is needed.


3.2 TESTING THE FIT OF THE MODEL TO THE DATA

A common technique for assessing the fit of a least-squares estimate is to perform a
statistical F-test (e.g. Wonnacott & Wonnacott 1981). An F-statistic is roughly the ratio

$$F = \frac{\text{variance explained by the model}}{\text{unexplained variance}}. \tag{3.17}$$

The random variable $F$ follows the F-distribution, which has been tabulated (e.g.
Abramowitz & Stegun 1965). We use the F-test to compare the fit of the data to a decaying
sinusoid model.

Suppose that the record $x(t)$ consists solely of zero-mean stationary Gaussian white noise
$n(t)$. For free oscillation data, we have found that it is a reasonable approximation to say
the background noise is Gaussian white noise and almost stationary. This can be demonstrated
by generating ordered value plots of the data, as in Fig. 10 [Wilk & Gnanadesikan (1968)
contains details on ordered value, or P-P, plotting of data].

As before, one estimates the complex amplitude $\mu$ of a decaying sinusoid of frequency
$\hat\omega$ by fitting the model $\mu\,W_k(\omega - \hat\omega;\beta,\nu)$ to the random
variables

$$y_k(\omega) = \sum_{t=0}^{N-1} w_k(t;\beta,\nu)\exp(-i\omega t)\,n(t); \quad k = 0, 1, \ldots, K-1. \tag{3.18}$$

There is a finite probability that a decaying sinusoid model will fit the complex
eigenspectra of the noise (3.18) at some frequency. The chance that this will happen is a
measure of the confidence that a true decaying sinusoid exists at that frequency.

When no harmonic signal is present, the expected value of each transformed complex
eigenspectrum vanishes:

$$\langle z_k(\omega_j)\rangle = 0. \tag{3.19}$$

However, the presence of noise, or signal, may cause any given transformed complex
eigenspectrum $z_k(\omega_j)$ to be non-zero at some frequencies. This departure of
$z_k(\omega_j)$ from its expected value may be partly 'explained' by the linear regression
analysis. Using the estimated value $\hat\mu(\omega_j)$ from (3.13), the estimated value of
$z_k(\omega_j)$ is

$$\hat z_k(\omega_j) = \hat\mu(\omega_j)\,V_k(0;\beta,\nu). \tag{3.20}$$

The deviation of $z_k(\omega_j)$ from $\langle z_k(\omega_j)\rangle$ may be decomposed into an
'explained' deviation, $[\hat z_k(\omega_j) - \langle z_k(\omega_j)\rangle]$, and an
'unexplained' deviation, $[z_k(\omega_j) - \hat z_k(\omega_j)]$:

$$[z_k(\omega_j) - \langle z_k(\omega_j)\rangle] = [\hat z_k(\omega_j) - \langle z_k(\omega_j)\rangle] + [z_k(\omega_j) - \hat z_k(\omega_j)]. \tag{3.21}$$

Or, summing over $k$, and noting that $\langle z_k(\omega_j)\rangle = 0$,

$$\sum_{k=0}^{K-1} z_k(\omega_j) = \sum_{k=0}^{K-1} \hat z_k(\omega_j) + \sum_{k=0}^{K-1} [z_k(\omega_j) - \hat z_k(\omega_j)]. \tag{3.22}$$

The same equality holds when one takes the modulus squared of the deviations:

$$\sum_{k=0}^{K-1} |z_k(\omega_j)|^2 = \sum_{k=0}^{K-1} |\hat z_k(\omega_j)|^2 + \sum_{k=0}^{K-1} |z_k(\omega_j) - \hat z_k(\omega_j)|^2 \tag{3.23}$$

by multiplying (3.21) by its complex conjugate, and then summing over $k$. Substituting for
$\hat z_k(\omega_j)$, (3.23) becomes

$$\sum_{k=0}^{K-1} |z_k(\omega_j)|^2 = |\hat\mu(\omega_j)|^2 \sum_{k=0}^{K-1} |V_k(0;\beta,\nu)|^2 + \sum_{k=0}^{K-1} |z_k(\omega_j) - \hat\mu(\omega_j)\,V_k(0;\beta,\nu)|^2 \tag{3.24}$$

or

$$\varepsilon(\omega_j) = \theta(\omega_j) + \psi(\omega_j) \tag{3.25}$$

defining

$$\varepsilon(\omega_j) = \sum_{k=0}^{K-1} |z_k(\omega_j)|^2,$$

$$\theta(\omega_j) = |\hat\mu(\omega_j)|^2 \sum_{k=0}^{K-1} |V_k(0;\beta,\nu)|^2,$$

$$\psi(\omega_j) = \sum_{k=0}^{K-1} |z_k(\omega_j) - \hat\mu(\omega_j)\,V_k(0;\beta,\nu)|^2,$$

where $\varepsilon(\omega_j)$ is the total sample variance of the $z_k(\omega_j)$,
$\theta(\omega_j)$ is the sample variance explained by the decaying sinusoid hypothesis, and
$\psi(\omega_j)$ is the residual, or unexplained sample variance.

We formulate a test to reject the null hypothesis that $\mu = 0$. Consider the random
variable $\tilde F$ formed by taking the ratio of the explained sample variance to the
unexplained sample variance. Then

$$\tilde F(\omega_j) = \frac{\theta(\omega_j)}{\psi(\omega_j)} = \frac{|\hat\mu(\omega_j)|^2 \sum_{k=0}^{K-1} |V_k(0;\beta,\nu)|^2}{\sum_{k=0}^{K-1} |z_k(\omega_j) - \hat\mu(\omega_j)\,V_k(0;\beta,\nu)|^2}. \tag{3.26}$$

If there is a decaying sinusoid at frequency $\omega_j$, the denominator $\psi(\omega_j)$ will
be small, and thus the function $\tilde F(\omega_j)$ will be large. By chance, sometimes a
decaying sinusoid model will fit the time series $\{n(t)\}_{t=0}^{N-1}$ reasonably well at
some frequency. The probability of this happening can be calculated. Therefore, one can
describe quantitatively the confidence that there is a true signal at a given frequency.
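
The split of the total sample variance into explained and unexplained parts, and the ratio built from them, can be checked numerically. The following is a hedged sketch with made-up inputs: the function name `variance_decomposition` and the values of the transformed eigenspectra and taper transforms are hypothetical and serve only to exercise the identity "total = explained + unexplained".

```python
def variance_decomposition(z, v0):
    """Return (total, explained, unexplained) sample variances at one bin frequency.

    z  : transformed complex eigenspectra z_k
    v0 : transformed taper transforms V_k(0; beta, nu)
    """
    norm = sum(abs(v) ** 2 for v in v0)
    # Least-squares amplitude, as in the pointwise regression formula.
    mu_hat = sum(complex(v).conjugate() * zk for v, zk in zip(v0, z)) / norm
    total = sum(abs(zk) ** 2 for zk in z)
    explained = abs(mu_hat) ** 2 * norm
    unexplained = sum(abs(zk - mu_hat * v) ** 2 for v, zk in zip(v0, z))
    return total, explained, unexplained


# Usage with hypothetical values for K = 5 tapers.
z = [1 + 2j, 0.5 - 1j, -0.3 + 0.1j, 2 + 0j, -1 - 1j]
v0 = [1.0, 0.8, 0.0, 0.6, 0.0]
total, explained, unexplained = variance_decomposition(z, v0)
ratio = explained / unexplained
```

The identity holds exactly because the least-squares residual is orthogonal to the fitted model, so the cross term vanishes when the deviations are squared and summed.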

We need to know how the random variable $\tilde F(\omega_j)$ is related to the
F-distribution. In Lindberg (1986) it is shown that

$$F(\omega_j) = (K-1)\,\tilde F(\omega_j) = \frac{(K-1)\,\theta(\omega_j)}{\psi(\omega_j)} \tag{3.27}$$

follows an F-distribution with 2 and $(2K - 2)$ degrees of freedom. Therefore, the chance
that the random variable

$$F(\omega_j) = \frac{(K-1)\,|\hat\mu(\omega_j)|^2 \sum_{m=0}^{K-1}\sum_{l=0}^{K-1} W_m^*(0;\beta,\nu)\,(H^{-1})_{ml}\,W_l(0;\beta,\nu)}{\sum_{m=0}^{K-1}\sum_{l=0}^{K-1} [y_m(\omega_j) - \hat\mu(\omega_j)\,W_m(0;\beta,\nu)]^*\,(H^{-1})_{ml}\,[y_l(\omega_j) - \hat\mu(\omega_j)\,W_l(0;\beta,\nu)]} \tag{3.28}$$

takes on a particular value at some frequency due to random noise can be found using
standard tables of the F-distribution (e.g. Abramowitz & Stegun 1965).

Figure 10. Ordered value, or P-P plot of 675 independent values of the ratio $F(\omega_j)$ in
(3.28) using synthetic stationary Gaussian white noise as input data. The cumulative
probability distribution of the ordered observations $F_{(1)} \le \ldots \le F_{(675)}$ is
plotted on the ordinate against sample quantiles on the abscissa. The $i$th point has
abscissa $i/675$ and ordinate given by the cumulative probability of the $i$th ordered
observation. The graph is almost a straight line, demonstrating that the ratio
$F(\omega_j)$ follows an F-distribution for Gaussian white noise input data.
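
For the particular case of 2 and $2K - 2$ degrees of freedom, the tail probability of the F-distribution has a simple closed form, so tables are not strictly needed. The sketch below uses that standard closed form; the function names `f_tail_probability` and `f_critical` are hypothetical.

```python
def f_tail_probability(f, num_tapers):
    """P(F > f) for an F-distribution with 2 and 2K - 2 degrees of freedom.

    For a numerator with 2 degrees of freedom the survival function is
    (1 + 2 f / d2) ** (-d2 / 2); with d2 = 2K - 2 this becomes the line below.
    """
    m = num_tapers - 1
    return (1.0 + f / m) ** (-m)


def f_critical(confidence, num_tapers):
    """Value of F exceeded by chance with probability 1 - confidence."""
    m = num_tapers - 1
    p = 1.0 - confidence
    return m * (p ** (-1.0 / m) - 1.0)


# Usage: thresholds for K = 5 tapers (2 and 8 degrees of freedom).
levels = {c: f_critical(c, 5) for c in (0.90, 0.95, 0.99)}
# levels[0.90] is about 3.11, levels[0.95] about 4.46, levels[0.99] about 8.65
```

The same function gives the detection confidence of an observed peak as `1 - f_tail_probability(f, K)`; it requires at least two tapers.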

Figure 10 is an ordered value, or P-P plot (Wilk & Gnanadesikan 1968) of 675 independent
values of the random variable $F(\omega_j)$ generated from a synthetic record of Gaussian
white noise. If the sample followed an F-distribution exactly, the ordered value plot would
lie on a straight line connecting the points (0, 0) and (1, 1). The departure of the ordered
value plot from a straight diagonal line is not significant at the 95 per cent confidence
level, using a Kolmogorov-Smirnov test for goodness of fit (Bickel & Doksum 1977). This
demonstrates graphically that the ratio $F(\omega_j)$ follows an F-distribution when the data
consist of Gaussian white noise.

4  Data  examples 

We illustrate the multiple-taper algorithm with two examples of decaying oscillations
immersed in white noise. In the first, we analyse a synthetic IDA record in which the
signal-to-noise power ratio is known a priori. In the second, we study a 340-hr record of the
1977 Sumbawa event from IDA station NNA (Ñaña, Peru). Spectral estimates made by taking the
DFT of cosine-tapered data are compared to results produced by the multiple-taper technique.
We find the multiple-eigentaper algorithm is superior for detecting low-amplitude decaying
sinusoids in noise.

We have focused our attention on the gravest observed seismic free oscillation, the
spheroidal multiplet ${}_0S_2$. (${}_1S_1$ has lower frequency, but this oscillation of the
inner core has not yet been conclusively observed.) The multiplet ${}_0S_2$ consists of five
decaying sinusoids at distinct frequencies. These 'singlets' are labelled by an azimuthal
order number $m \in \{-2, -1, 0, 1, 2\}$. The five singlet frequencies of this oscillation are
widely split by the rotation of the Earth, so much so that the magnitude of the quadratic
second-order Coriolis splitting is roughly 60 per cent that of the quadratic splitting caused
by the Earth's hydrostatic ellipticity (Dahlen & Sailor 1979). The singlet frequencies have
been measured by Buland et al. (1979) from spherical harmonic stacks of six 150-hr IDA
records of the 1977 Sumbawa event. The multiplet ${}_0S_2$ is difficult to measure as it is
excited by only the very largest earthquakes. Even for the Sumbawa event, the signal-to-noise
ratio is not large. Also, some singlets have very small amplitudes at some stations because
of the dependence of singlet amplitude on latitude. As a result, no more than two or three of
the five singlet resonance functions can be seen in any of the conventional amplitude spectra
of records from the seven IDA stations existing at that time.

We constructed a 300-hr synthetic IDA record from CMO (College, Alaska) using a source
located in Oaxaca, Mexico. The five singlets of ${}_0S_2$, split by rotation and ellipticity,
were included in the seismogram [see Park & Gilbert (1986) for an outline of the computation
procedure]. Gaussian white noise was added to the record with amplitude scaled so that
$N|\mu|^2/\sigma_n^2 = 73$ for the $m = 0$ singlet oscillation, $N|\mu|^2/\sigma_n^2 = 32.5$ for
the $m = \pm 1$ singlets, and $N|\mu|^2/\sigma_n^2 = 3.6$ for the $m = \pm 2$ singlets. The record
was sampled at 160 s intervals to produce a time series of 6750 points. We analysed the
record with five eigentapers with $\Omega N = 8\pi$, $\beta = 0.6$, $\nu = 0.01$ to produce
amplitude estimates $\hat\mu(\omega)$ and an F-test of the fit of $\hat\mu(\omega)$ to the
complex eigenspectra. Five tapers were chosen because the five lowest order eigentapers with
$\Omega N = 8\pi$, $\beta = 0.6$ and $\nu = 0.01$ have fractional leakage of 0.01 or less
(Table 2). We also produced a spectral estimate using a cosine taper for comparison.
According to arguments outlined in the appendix, $\langle F\rangle$ should be near the 99 per
cent confidence level for the $m = \pm 1$ lines and considerably greater for the $m = 0$ line.
The $m = \pm 2$ lines have $\langle F\rangle = 2.25$, but large random fluctuations in $F$ are
possible.

The spectral estimate using a cosine taper $|y_c(\omega)|$ is compared with the multitaper
amplitude estimate $|\hat\mu(\omega)|$ in Fig. 11. We graph the frequency band
$280 < f < 340\ \mu$Hz containing the five singlets of ${}_0S_2$ and no other known seismic
free oscillation. The ordinate scales of the plots do not match because $y_c(\omega)$ is an
estimated amplitude spectrum and $\hat\mu(\omega)$ is the amplitude of a presumed harmonic
signal at $t = 0$. Main features of the plots are similar, however, because both represent
discrete Fourier transforms of tapered data [$\hat\mu(\omega)$ corresponds to the DFT of the
data times a hybrid taper as shown in (3.13)-(3.14)]. The $m = -2$, 0 and 1 singlets, having
frequencies given in Table 5, are readily discernible. The prominence of the $m = -2$ singlet
is puzzling in light of its low input amplitude. The $m = -1$ singlet appears to be obscured
somewhat by noise interference.

The F-test of the fit of $\hat\mu(\omega)$ to the complex eigenspectra is graphed in Fig. 12.
All five singlets of ${}_0S_2$ are observable with better than 95 per cent detection
confidence. Their measured frequencies are given in Table 5, along with estimates of the
expected errors in the

Table 5. Frequencies of ${}_0S_2$ in synthetic record.

                               Singlet azimuthal order m
                             -2         -1          0          1          2
Input singlets
  Input frequency (mHz)    .299800    .304615    .309337    .313874    .318226
  Input phase              -85°       -133°      -2.7°      126°       72°
F-test results
  Frequency (mHz)          .29973     .30436     .309356    .31371     .31889
  Frequency uncertainty    .00022     .00034     .000074    .00018     .00035
  Phase                    -74°       -118°      -1.5°      148°       87°
  F-value                  5.5        13.7       86.0       66.6       5.9

Figure 11. (a) Amplitude of a spectral estimate using a cosine taper $|y_c(\omega)|$ for a
synthetic record of ${}_0S_2$. (b) The function $|\hat\mu(\omega)|$, where $\hat\mu(\omega)$ is
the estimated complex amplitude of a decaying sinusoid in a synthetic record of ${}_0S_2$
using five eigentapers with parameters $\Omega N = 8\pi$, $\beta = 0.6$ and $\nu = 0.01$. In
both (a) and (b), three of the five singlets of ${}_0S_2$ are visible. The true positions of
the input singlets are marked.

frequencies produced by the method described in the appendix. The most poorly fit frequency
observation is within $2\sigma$ of the true value. Note the rough equivalence of the F-test
values for the $m = \pm 2$ singlet lines. The amplitude of the $m = -2$ singlet in Fig. 11 is
enhanced by noise fluctuations, but the noise contribution has incoherent phase, causing the
$m = -2$ F-test value to fall relative to that of neighbouring oscillation peaks. On the other
hand, an apparent noise-minimum at the frequency of the $m = +2$ single-line allows its small
amplitude to be detectable in the plot of the F-test. Note also that the F-test has peaks at
frequency values not associated with ${}_0S_2$ singlets. These are caused by random
statistical fluctuations. The frequency band shown contains 60 independent frequency samples.
Therefore, one would expect that due to randomness, roughly three values of the F-test in
Fig. 12 would protrude above $F \approx 4.5$, the 95 per cent confidence level for the
F-distribution.

Figure 12. F-test values resulting from a test of the fit of estimated amplitude
$\hat\mu(\omega)$ to the eigenspectra obtained using five eigentapers with parameters
$\Omega N = 8\pi$, $\beta = 0.6$, and $\nu = 0.01$. The data is a synthetic record of
${}_0S_2$; it consists of five decaying sinusoids whose frequencies are listed in Table 5. All
five have peaks above the 95 per cent confidence level. The function $|\hat\mu(\omega)|$ is
plotted in Fig. 11b. The value $F = 3.11$ corresponds to the 90 per cent confidence level,
$F = 4.46$ is the 95 per cent confidence level, and $F = 8.65$ is the 99 per cent level. The
true positions of the input singlets are marked.

We also took 340 hr of vertical IDA gravimeter data from station NNA, starting 8.5 hr after
the onset of the Sumbawa event. This record is relatively complete, with only two data gaps
of roughly 2.5 hr each at 45 and 275 hr into the record. Time series points falling in the
gaps were assigned the value zero. The data were sampled at 20 s intervals. We low-pass
filtered and decimated the record so that it contained 7668 points taken at 160 s intervals.
Aftershocks that did not visibly affect the instrument in a non-linear manner were retained,
as their effect on the spectrum in the vicinity of ${}_0S_2$ is small. Sections exhibiting
non-linear seismometer response contribute significant energy at low frequencies, and so
these were removed.

We had to know roughly the $Q$'s of the singlets of ${}_0S_2$ to apply our procedure. The
$Q = 560$ value for ${}_0S_2$ given by the model of Masters & Gilbert (1983) corresponds to
$\beta = 0.68$. Chao & Gilbert (1980) estimate that the $m = -2$ singlet of ${}_0S_2$ has a $Q$
of 415, the $m = 0$ singlet has a $Q$ of 604 and the $m = 2$ singlet has a $Q$ of 504. The $Q$
measurement reported by Hansen & Schnapp (1982) leads to a decay parameter of $\beta = 0.84$.
We analysed the record with a set of five eigentapers having parameters $\Omega N = 8\pi$,
$\nu = 0.01$ and $\beta = 0.65$.
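
The conversion from $Q$ to the decay parameter can be sketched numerically. The definition of $\beta$ belongs to Section 2 and is not restated in this passage, so the relation used below, $\beta = \alpha T/\pi$ with $\alpha = \omega/(2Q)$ and $T$ the record length, is an assumption chosen because it reproduces the quoted value of 0.68; the function name `decay_parameter` and the centre frequency of 0.3093 mHz are likewise illustrative choices.

```python
import math


def decay_parameter(q, freq_hz, n_samples, dt_seconds):
    """Assumed relation: beta = alpha * T / pi, with alpha = omega / (2 Q).

    Algebraically this reduces to beta = f * T / Q, i.e. the number of
    oscillation cycles in the record divided by Q.
    """
    omega = 2.0 * math.pi * freq_hz
    alpha = omega / (2.0 * q)
    record_length = n_samples * dt_seconds
    return alpha * record_length / math.pi


# Usage: NNA record of 7668 points at 160 s, Q = 560, frequency near 0.3093 mHz.
beta_q560 = decay_parameter(560.0, 0.3093e-3, 7668, 160.0)
```

Under the same assumption, the singlet $Q$ values of 415, 604 and 504 bracket the value $\beta = 0.65$ adopted for the analysis.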

The function $|\hat\mu(\omega)|$ obtained using the eigentapers is plotted in Fig. 13b, and
the amplitude of the spectral estimate using the cosine taper $|y_c(\omega)|$ is presented in
Fig. 13a. Again, we graph the frequency band $280 < f < 340\ \mu$Hz. Spectra were calculated at
frequencies separated by 0.165 $\mu$Hz using the DFT. Table 6 lists the frequency estimates of
the five singlets of ${}_0S_2$ made by Buland et al. (1979); these frequencies are marked in
Fig. 13a and b. Only the $m = \pm 2$ lines are clearly visible in Fig. 13a and b. Candidates
for the other singlet resonances are evident but do not protrude significantly above the
apparent ambient noise level.

Figure 13. (a) Amplitude of a spectral estimate using a cosine taper for a time series of the
Sumbawa event recorded at IDA station NNA. We plot the frequency band
$280 < f < 340\ \mu$Hz containing the five ${}_0S_2$ singlets. Only two of the singlets are
observable. (b) Amplitude of the function $\hat\mu(\omega)$ for a time series of the Sumbawa
event recorded at IDA station NNA. We plot the frequency band $280 < f < 340\ \mu$Hz
containing the five ${}_0S_2$ singlets, but again only two singlets are visible. The positions
of the five singlets as determined by stacking are indicated.

Figure 14 is a graph of the F-test of the fit of $\hat\mu(\omega)$ to the complex
eigenspectra. There are four peaks above the 95 per cent detection confidence level in
Fig. 14 which correspond to singlets of ${}_0S_2$. The $m = -1$ singlet appears to be
contaminated by noise, resulting in a low, asymmetric variance-ratio peak. The estimated
frequencies of all five lines, and their associated uncertainties, are listed in Table 6. The
discrepancy between the $m = -1$ frequency estimate and that of Buland et al. (1979) is
another indicator of the noise contamination of the $m = -1$ singlet. The other peaks in
Fig. 14 above the 95 per cent confidence level are most likely due to random fluctuations.

Figure 14. F-test for the estimated amplitude $\hat\mu(\omega)$ plotted in Fig. 13b. The time
series being analysed is a record of the Sumbawa event from IDA station NNA. We plot the
frequency band $280 < f < 340\ \mu$Hz containing the five ${}_0S_2$ singlets. Four of the
singlets have F-test peaks corresponding to greater than 95 per cent confidence of detection.
The positions of the five singlets as determined by stacking are indicated.

In the above examples, we knew (approximately) the frequencies of the decaying oscillations
and that they had large enough amplitude to be detectable. To be useful, the multitaper
detection algorithm for decaying sinusoids should yield comparable results when either or
both of the above conditions are not satisfied. Given the known frequencies of the gravest
seismic oscillations, one could use the algorithm to search for so-called 'silent' events
(e.g. Kanamori & Cipar 1974), whose existence is still controversial. In the more
conservative enterprise of expanding and refining the free-oscillation data set in order to
constrain deep Earth structure more reliably, the eigentaper algorithm offers hope of
retrieving more marginally observable modes than are accessible using single-taper
algorithms. Care must be exercised that peaks in the F-test due to random noise are not
misidentified as seismic free oscillations. To this end, quantitative comparison of more than
one seismic record is essential. This has been done by combining the standard techniques of
stacking and stripping of low-frequency seismic records (Gilbert & Dziewonski 1975) with the
multitaper algorithm. This is discussed in Part II of this paper.

Table 6. Frequencies of ${}_0S_2$ in NNA record of Sumbawa event.

                               Singlet azimuthal order m
                             -2         -1          0          1          2
F-test results
  Frequency (mHz)          .29988     70S 26     .30918     .31423     .31870
  Frequency uncertainty    .00027     .00060     noon       .00016     .00017
  Phase                    174°       -48°       -47°       142°       10°
  F-value                  71 7       4.2        so         30.0       20 S
From Buland et al. (1979)
  Frequency (mHz)          .300010    .304799    .309490    .314000    .318499

5  Summary 

We have described a variational procedure for determining tapers that optimally resist
spectral leakage from outside a frequency region of bandwidth $2\Omega$ for exponentially
decaying sinusoids contaminated by white noise. Multiplying the data by these tapers creates
a number of time series. A decaying sinusoid model is fitted to the discrete Fourier
transforms of the tapered data series at each frequency of interest (equation 3.15). The fit
of this model to the data is tested at each frequency using a statistical F-test (equation
3.28). This gives a quantitative measure of the chance that there is a decaying sinusoid at
any given frequency in the data. We have shown that this procedure is a sensitive detector of
decaying harmonic lines in free oscillation data.

In Part II of this paper, we shall present a number of extensions to the multiple-taper
method of harmonic analysis. We shall explain how the technique has been modified to
estimate the harmonic components of records containing gaps. We discuss how sinusoids at
frequencies between the discrete FFT bin frequencies can be detected, and how this method
can be combined with conventional multi-station stacking procedures. The resolution of
closely spaced harmonic lines is treated. Subsequently, we plan to introduce algorithms for
finding the decay rates of free oscillations, as well as their frequencies.

Appendix: error estimation

The methods of Section 3 can be used to obtain estimates of the complex amplitude and the
frequency of a decaying sinusoid in a time series. Random noise can cause the estimated
amplitude and estimated frequency to deviate from the true values. This appendix outlines
methods for calculating the expected size of these deviations.


A1 Estimated amplitude

First, consider the estimated amplitude $\hat\mu(\omega)$. It is a statistical estimator of the
true amplitude $\mu$. The utility of $\hat\mu$ as an estimator can be gauged by its bias
$\langle\hat\mu\rangle - \mu$ and its mean square error $\langle|\hat\mu - \mu|^2\rangle$. Let the
data $x(t)$ be zero-mean white noise $n(t)$ plus a decaying sinusoid with frequency $\omega_T$. Then

$$x(t) = n(t) + \mu \exp(i\omega_T t - \alpha t), \qquad t = 0, 1, \ldots, N-1, \tag{A.1}$$

where $\mu$ is the true complex amplitude, $\alpha$ is the true decay rate, and
$\langle n(t)\,n^*(t')\rangle = \sigma_N^2\,\delta_{tt'}$.
We also assume that $\langle n(t)\,n(t')\rangle = 0$. [This is justified as only the real part of $n(t)$ is
actually measured, leaving us free to define its imaginary part. Miller (1974, p. 41) gives
further details.] The $k$th transformed complex eigenspectrum of the data is

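$$z_k(\omega) = (G^{-1})_{kj}\, y_j(\omega) = g_k(\omega) + \mu V_k(\omega - \omega_T; \beta, \nu), \tag{A.2}$$

where

$$g_k(\omega) = (G^{-1})_{kj} \sum_{t=0}^{N-1} \exp(-i\omega t)\, v_j(t; \beta, \nu)\, n(t)$$

and $G$, $y_j(\omega)$ and $V_k$ are as defined in Section 3. It follows that

$$\langle z_k(\omega)\rangle = \mu V_k(\omega - \omega_T; \beta, \nu) \tag{A.3}$$

and

$$\langle z_k(\omega)\, z_j^*(\omega)\rangle = \sigma_N^2\,\delta_{kj} + |\mu|^2\, V_k(\omega - \omega_T; \beta, \nu)\, V_j^*(\omega - \omega_T; \beta, \nu). \tag{A.5}$$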

The expected value of $\hat\mu$, combining (3.13) and (A.3), is

$$\langle\hat\mu(\omega)\rangle = \mu\, \frac{\sum_{k=0}^{K-1} V_k^*(0; \beta, \nu)\, V_k(\omega - \omega_T; \beta, \nu)}{\sum_{k=0}^{K-1} |V_k(0; \beta, \nu)|^2}. \tag{A.6}$$

When $\omega = \omega_T$, $\langle\hat\mu(\omega_T)\rangle = \mu$, so $\hat\mu$ is unbiased at the true
frequency $\omega_T$. At other frequencies $\hat\mu$ is a biased estimator of $\mu$.

The mean square error of $\hat\mu$ is constant at all frequencies and, using (A.6), is

$$\langle|\hat\mu - \mu|^2\rangle = \frac{\sigma_N^2}{P}, \tag{A.7}$$

where

$$P = \sum_{k=0}^{K-1} |V_k(0; \beta, \nu)|^2.$$
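The estimator behind (A.6) and (A.7) can be illustrated with a short numerical sketch. It assumes the weighted form implied by (A.6), namely that the amplitude estimate is the sum of $V_k^*(0)\,z_k$ divided by $P$; the function names and the values chosen for $V_k(0)$ are hypothetical and serve only as an illustration.

```python
def amplitude_estimate(z, v0):
    """Least-squares amplitude from eigenspectra z_k and taper transforms V_k(0).

    Assumed form (read off from A.6): mu_hat = sum_k conj(V_k(0)) z_k / P,
    with P = sum_k |V_k(0)|^2.
    """
    p = sum(abs(v) ** 2 for v in v0)
    mu_hat = sum(v.conjugate() * zk for v, zk in zip(v0, z)) / p
    return mu_hat, p


def predicted_mse(sigma_n2, p):
    """Mean square error of the amplitude estimate, sigma_N^2 / P, as in (A.7)."""
    return sigma_n2 / p


# Hypothetical values of V_k(0) for K = 3 tapers (illustration only).
v0 = [0.9 + 0.0j, 0.0 + 0.3j, -0.2 + 0.0j]
mu_true = 2.0 - 1.0j

# Noise-free eigenspectra at the true frequency: z_k = mu V_k(0), from (A.3).
z = [mu_true * v for v in v0]

mu_hat, p = amplitude_estimate(z, v0)
print(mu_hat, p, predicted_mse(0.47, p))
```

With noise-free input the sketch returns the true amplitude, which is the unbiasedness at the true frequency noted above; adding noise to `z` would scatter the estimate with the variance given by `predicted_mse`.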

A2 Frequency estimates

Now consider the estimation of the true frequency $\omega_T$. The true frequency can be estimated
from (1) the frequencies of peaks in the squared modulus of the estimated amplitude $|\hat\mu|^2$, (2)
minima in the unexplained sample variance $\psi(\omega)$ introduced in (3.25), or (3) peaks in the
random variable $F(\omega) = (K-1)\,\theta(\omega)/\psi(\omega)$. These all provide approximately unbiased
estimates of the true frequency $\omega_T$, and their mean square errors can be computed, as
shown below.


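A2.1 Frequencies estimated from peaks in $|\hat\mu|^2$

The function $|\hat\mu(\omega)|^2$ achieves a peak at frequency $\omega_\theta$, where

$$0 = \left.\frac{d}{d\omega}\,|\hat\mu(\omega)|^2\right|_{\omega=\omega_\theta}. \tag{A.8}$$

In a neighbourhood of the true frequency $\omega_T$,

$$0 = \theta'(\omega_\theta) \approx \theta'(\omega_T) + (\omega_\theta - \omega_T)\,\theta''(\omega_T). \tag{A.9}$$

Taking expectation values of both sides, and assuming that $(\omega_\theta - \omega_T)$ and
$\theta''(\omega_T)$ are uncorrelated:

$$\langle\omega_\theta - \omega_T\rangle \approx -\frac{\langle\theta'(\omega_T)\rangle}{\langle\theta''(\omega_T)\rangle}. \tag{A.10}$$

Define the matrix $M^{(j)}$ with elements

$$M^{(j)}_{tt'} = (t - t')^j, \qquad t, t' = 0, 1, 2, \ldots, N-1, \tag{A.11}$$

where $j$ is an integer, and the vector $v$ with elements

$$v(t) = \frac{1}{P} \sum_{k=0}^{K-1} V_k^*(0; \beta, \nu)\, v_k(t; \beta, \nu)\, \exp(-\alpha t), \qquad t = 0, 1, 2, \ldots, N-1. \tag{A.12}$$

Then some algebra shows that

$$\langle\theta'(\omega_T)\rangle = i\,|\mu|^2 P\; v \cdot M^{(1)} \cdot v = 0 \tag{A.13}$$

as $M^{(1)}$ is antisymmetric, and

$$\langle\theta''(\omega_T)\rangle = -P\,|\mu|^2\; v \cdot M^{(2)} \cdot v. \tag{A.14}$$

Therefore, from (A.10), $\langle\omega_\theta - \omega_T\rangle = 0$ and $\omega_\theta$ is an unbiased estimator
of $\omega_T$. To find the mean square error of $\omega_\theta$, square both sides of (A.9) and take
expected values:

$$\langle(\omega_\theta - \omega_T)^2\rangle \approx \frac{\langle[\theta'(\omega_T)]^2\rangle}{\langle[\theta''(\omega_T)]^2\rangle}. \tag{A.15}$$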


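Using the relation

$$\begin{aligned}
\langle x(t_1)\,x^*(t_2)\,x(t_3)\,x^*(t_4)\rangle ={}& |\mu|^4 \exp[i\omega_T(t_1 - t_2 + t_3 - t_4)]\, \exp[-\alpha(t_1 + t_2 + t_3 + t_4)] \\
&+ \sigma_N^2 |\mu|^2 \{\delta_{t_1 t_2} \exp[i\omega_T(t_3 - t_4) - \alpha(t_3 + t_4)] \\
&\qquad + \delta_{t_3 t_4} \exp[i\omega_T(t_1 - t_2) - \alpha(t_1 + t_2)] \\
&\qquad + \delta_{t_1 t_4} \exp[i\omega_T(t_3 - t_2) - \alpha(t_3 + t_2)] \\
&\qquad + \delta_{t_2 t_3} \exp[i\omega_T(t_1 - t_4) - \alpha(t_1 + t_4)]\} \\
&+ \sigma_N^4\, (1 + \tfrac{1}{2}\delta_{t_1 t_3})\, (\delta_{t_1 t_2}\delta_{t_3 t_4} + \delta_{t_1 t_4}\delta_{t_2 t_3})
\end{aligned} \tag{A.16}$$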


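one finds that

$$\langle[\theta'(\omega_T)]^2\rangle = P^2 \{\sigma_N^4\; s \cdot M^{(2)} \cdot s + 2\sigma_N^2 |\mu|^2\; s \cdot r\} \tag{A.17}$$

and

$$\langle[\theta''(\omega_T)]^2\rangle = P^2 \{\sigma_N^4\; s \cdot M^{(4)} \cdot s + 2\sigma_N^2 |\mu|^2\; s \cdot r + |\mu|^4 (v \cdot M^{(2)} \cdot v)^2\}, \tag{A.18}$$

where $s$ has components

$$s_t = |v(t) \exp(\alpha t)|^2, \qquad t = 0, 1, \ldots, N-1, \tag{A.19}$$

and $r$ has components

$$r_t = [(M^{(2)} \cdot v)_t]^2, \qquad t = 0, 1, \ldots, N-1. \tag{A.20}$$

For sufficiently large initial signal-to-noise ratios, $N|\mu|^2/\sigma_N^2 \gg 1$, and

$$\langle(\omega_\theta - \omega_T)^2\rangle \approx \frac{2\sigma_N^2}{|\mu|^2}\, \frac{s \cdot r}{(v \cdot M^{(2)} \cdot v)^2}. \tag{A.21}$$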

The mean-square error of the estimator $\omega_\theta$ decreases as the signal-to-noise ratio increases.
Figure A1 is a plot of the estimated rms misfit of $\omega_\theta$, defined by

$$\langle\omega_\theta - \omega_T\rangle_{\rm rms} = \sqrt{\langle(\omega_\theta - \omega_T)^2\rangle},$$

as a function of initial signal-to-noise ratio for tapers with parameters $\beta = 0.6, 1.05$ and
$\nu = 0, 0.01, 0.1, 1$ using (A.15). The misfit is plotted on the ordinate as a fraction of the
Rayleigh frequency $\omega_R = 2\pi/T$, where $T = N\Delta t$. The parameter $N|\mu|^2/\sigma_N^2$ is plotted on the
abscissa. One expects frequency uncertainty to increase rapidly with decreasing signal-to-noise
ratio, but for $N|\mu|^2/\sigma_N^2 < 10$, the estimated frequency uncertainty in Fig. A1 is essentially
constant. This is because relation (A.15) ceases to be a good approximation at low
signal-to-noise ratios, where the first-order expansion (A.9) fails to hold, and $(\omega_\theta - \omega_T)$ and
$\theta''(\omega_T)$ are correlated. The solid curve corresponds to $\nu = 0$; larger values of $\nu$ correspond to
succeedingly finer dashed curves. The rms misfit $\langle\omega_\theta - \omega_T\rangle_{\rm rms}$ tends to decrease with
decreasing values of $\nu$ (except for $\beta = 1.05$ and $N|\mu|^2/\sigma_N^2 > 80$).

Figure A1 (a, b). Estimated rms misfit of estimated frequency $\omega_\theta$ to true frequency $\omega_T$ as a function of
$N|\mu|^2/\sigma_N^2$ for tapers with $\Omega N = 8\pi$, $\beta = 0.6, 1.05$ and $\nu = 0, 0.01, 0.1$ and 1. The curves are meaningless
for $N|\mu|^2/\sigma_N^2 < 10$ because (A.15) fails to be a good approximation. Uncertainty decreases with
decreasing values of noise parameter $\nu$ (except for $\beta = 1.05$ and $N|\mu|^2/\sigma_N^2 > 80$). Uncertainty decreases with
increasing signal-to-noise ratio.


A2.2 Frequencies estimated from minima in $\psi(\omega)$

Another estimator of the true frequency $\omega_T$ is $\omega_\psi$, the frequency of a minimum in the
unexplained sample variance, defined by

$$\psi'(\omega_\psi) = 0.$$

The frequency $\omega_\psi$ is also an unbiased estimator of $\omega_T$, as

$$\langle\omega_\psi - \omega_T\rangle \approx -\frac{\langle\psi'(\omega_T)\rangle}{\langle\psi''(\omega_T)\rangle}. \tag{A.22}$$

The result (A.22) can be obtained using (3.25) and (A.13):
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,  ,  *  ~  »  //  d 

(i//  (coy ))  -  (£  ( cj j ))  —  ^  yl  —  | r* ( co ) | 

=  n  '  V 


V  d  co 


^  uj  =  tJ  j-  ^ 


K  l  X  -  l  A’  -  l 

=  *ImI2  2  2  2  u  -  exp  (  a(f  +  f')|  =  0.  <A.23| 

*  = o  r^o  r=o 


Also. 


<^"(coT)>=  -  |/i|3rr(M(2)*r(1))>0, 
where  the  matrix  T(7>  has  components 
K  i 


r(/> . 

1  Tt 


V  {»»  (/)  >»■  it ’  l  \  V.  v  1' 

.  .  I  #C  ’  '  *  •  •  ’  /V  •  -  *  r  •  •  .  it! 


t.  t'  -  (J.  1 . V  I 


( A. 24 ) 


(A. 25) 


and  tr  denotes  the  trace  operation  on  matrices.  Define  also  the  matrix  with  com¬ 

ponents 

•If'1’  =  r‘‘ *  exp  (  /if)  exp  (  vt  f.  i,t-  0.1 . V  I.  (A.2h) 

Then the mean square error of the estimator $\omega_\psi$ can be approximated as

$$\langle(\omega_\psi - \omega_T)^2\rangle \approx \frac{\langle[\psi'(\omega_T)]^2\rangle}{\langle[\psi''(\omega_T)]^2\rangle}, \tag{A.27}$$

where


(A  2b) 


<lv'(wT)|:>=o4vtr(M,J|  •  r<;>)  +  2ct-v  I  m  I'  V  '|,M<"-r|0-a|  |„|2 

t  -  0 

and 

\  i 

<1^’(wti12>=  04v  t r  ( M<4 1  •  r,2lt  +  2o:\  ipl2  V  |(Mu,-r|0-a|  l„|:+  I  tu  I4 1  tr  (M,:)  ■  r|Q'a|  )| 

~  0 

( A.21)) 

For  AT  n  1 2  Oy  ->  I  : 


,  (  o\\  1=0 

coT  )•>--(  )• 

\\H\2I  (tr 


2  V|(M<1,-r|,,-a|a2 


(,r(M,2)r|a'Q|)|; 


(A. 20) 


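As signal-to-noise ratio increases, $\langle(\omega_\psi - \omega_T)^2\rangle$ becomes smaller. Graphs of
$\langle\omega_\psi - \omega_T\rangle_{\rm rms} = \sqrt{\langle(\omega_\psi - \omega_T)^2\rangle}$ have the same shape as the plots of
$\langle\omega_\theta - \omega_T\rangle_{\rm rms}$ in Fig. A1, but $\langle\omega_\psi - \omega_T\rangle_{\rm rms}$ is 10 to 25 per cent larger than
$\langle\omega_\theta - \omega_T\rangle_{\rm rms}$ for a given signal-to-noise ratio.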
For  example,  if  (3  =  v  =  0  and  .V |  p  |2  o\  =  10.  <w„  -  tar)ms  =  O.I2oojr.  and  <co.v.  uv)rms 
-  0. 1  h5  wR . 


A2.3 Frequencies estimated from peaks in $F(\omega)$

The true frequency $\omega_T$ can also be estimated from the frequencies $\omega_F$ of peaks in the F-test
curve defined by $F'(\omega_F) = 0$. As before, to first order in $(\omega_F - \omega_T)$, assuming $(\omega_F - \omega_T)$
and $F''(\omega_T)$ are uncorrelated:

$$\langle\omega_F - \omega_T\rangle \approx -\frac{\langle F'(\omega_T)\rangle}{\langle F''(\omega_T)\rangle}. \tag{A.31}$$

By  (3.77). 

$$\psi^2(\omega)\,F'(\omega) = (K-1)\,[\theta'(\omega)\,\psi(\omega) - \theta(\omega)\,\psi'(\omega)]. \tag{A.32}$$

Assume that $\psi^2$ and $F'$, $\theta'$ and $\psi$, and $\theta$ and $\psi'$ are uncorrelated at $\omega_T$. Then the expectation
value of the right-hand side of (A.32) vanishes at $\omega = \omega_T$. But


<\ir(Gj-r )) 


=  o4\-(  S 

'  i  o  ' 


7=0 


so $\langle F'(\omega_T)\rangle = 0$. As $\langle F''(\omega_T)\rangle \neq 0$, $\omega_F$ is an unbiased estimator of $\omega_T$ by (A.31).

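The frequency $\omega_F$ can be expressed in terms of $\omega_\theta$ and $\omega_\psi$. Expanding $\theta(\omega_F)$ and
$\psi(\omega_F)$ in power series about their extrema to second order in $(\omega_F - \omega_\theta)$ and $(\omega_F - \omega_\psi)$:

$$F(\omega_F) \approx \frac{(K-1)\,[\theta(\omega_\theta) + \tfrac{1}{2}(\omega_F - \omega_\theta)^2\, \theta''(\omega_\theta)]}{\psi(\omega_\psi) + \tfrac{1}{2}(\omega_F - \omega_\psi)^2\, \psi''(\omega_\psi)} \tag{A.33}$$

and

$$(\omega_F - \omega_\theta) \approx \frac{\psi''(\omega_\psi)\,(\omega_F - \omega_\psi)\,F(\omega_F)}{\theta''(\omega_\theta)\,(K-1)} \tag{A.34}$$

from (A.33). Substitute (A.33) in (A.34) and let

$$\omega_F = \tilde\omega\, \frac{(\omega_\psi - \omega_\theta)}{2} + \frac{(\omega_\theta + \omega_\psi)}{2} \tag{A.35}$$

so that $\omega_F = \omega_\psi$ when $\tilde\omega = 1$, and $\omega_F = \omega_\theta$ when $\tilde\omega = -1$. Then $\tilde\omega$ satisfies

$$\tilde\omega^2 - 2(a + b)\,\tilde\omega - (1 + 2b - 2a) = 0, \tag{A.36}$$

where

$$a = -\frac{2\,\theta(\omega_\theta)}{(\omega_\psi - \omega_\theta)^2\, \theta''(\omega_\theta)}, \qquad b = \frac{2\,\psi(\omega_\psi)}{(\omega_\psi - \omega_\theta)^2\, \psi''(\omega_\psi)}.$$

The two solutions of the quadratic equation (A.36) are

$$\tilde\omega_\pm = (a + b) \pm \sqrt{(a + b)^2 + 2b + 1 - 2a}. \tag{A.37}$$

The solution $\tilde\omega_+$ is spurious because $|\tilde\omega_+| \to \infty$ as $a$ or $b \to \infty$, and truncated Taylor series
expansions in (A.33) and (A.34) are invalid for large values of $\tilde\omega$. The second solution $\tilde\omega_-$ is
constrained so that $|\tilde\omega_-| < 1$, and corresponds to $\omega_F$ lying between $\omega_\theta$ and $\omega_\psi$.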

As $\omega_F$ is a weighted average of $\omega_\theta$ and $\omega_\psi$, one might expect $\omega_F$ to be a more accurate
estimator of the true frequency $\omega_T$. This hope is dampened when one realizes that the
deviations of $\omega_\theta$ and $\omega_\psi$ from the true frequency $\omega_T$ are strongly positively correlated.



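The correlation between $\omega_\theta$ and $\omega_\psi$ can be estimated as

$$\langle(\omega_\theta - \omega_T)(\omega_\psi - \omega_T)\rangle \approx \frac{\langle\psi'(\omega_T)\,\theta'(\omega_T)\rangle}{\langle\psi''(\omega_T)\,\theta''(\omega_T)\rangle}, \tag{A.38}$$

where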

<^/’tu)|  )  0  (u)  i  )>  -  p  {a4-  ti  (M<:)  •  T*11) 

,v  t 

+  a  v  |  p  I2  V  (M")-v),(M(l)- r|oa|l„i>(nexp(  atl| 
o 

and 

<v"uotmmu>j-|>  =  P  (ct.v  tr(M'"  •  r-l 

,  tv  t 

+  2o\\p\-  V  (Mt21  •  vl,(M(21  •  r|0'a|  )n0(t)  exp  (  a/I 

r  -  0 

+  |p|4(v-  M<2)-v)tr(Af<2)-  r|Q'a|  )}.  ( A. 40 1 

where  M</)  has  elements 

St#  =  Mftvik)  exp  (aA-)u(/)  exp  (a/).  k,  1  =  0.  1.2 . V  I 

Using these equations one finds that the cross-correlation of $\omega_\theta$ and $\omega_\psi$ is almost unity.
For example, if $\beta = \nu = 0$ and $N|\mu|^2/\sigma_N^2 = 10$, $\langle(\omega_\theta - \omega_T)(\omega_\psi - \omega_T)\rangle = 0.22$. Any
averaging of the two estimators $\omega_\theta$ and $\omega_\psi$ will not result in an estimator which has
significantly less error associated with it.

Using (A.34), one can see that for large values of $F$, $\omega_F \approx \omega_\psi$. Therefore, we estimate the
errors in the frequencies of the F-test peaks using equation (A.27).


A3 Detection sensitivity

It is useful to know the sensitivity of the F-test to the presence of a decaying sinusoid of
frequency $\omega_T$. The signal-to-noise ratio required for detection of a sinusoid at a given
confidence level can be calculated. Suppose that the time series is given by (A.1), with $\mu$ either
purely real or purely imaginary. It can be shown that at frequency $\omega_T$ the random variable
$F$ defined in (3.28) follows a noncentral F-distribution with noncentrality parameter

$$\gamma = \frac{2|\mu|^2}{\sigma_N^2} \sum_{k=0}^{K-1} |V_k(0; \beta, \nu)|^2 \tag{A.41}$$

(Kendall & Stuart 1979). The expected value of $F(\omega_T)$ is

$$\langle F(\omega_T)\rangle = \frac{(2 + \gamma)(K - 1)}{2\,(K - 2)} \tag{A.42}$$

(Kendall & Stuart 1979, p. 279).

In Fig. A2, $\langle F(\omega_T)\rangle$ is plotted as a function of signal-to-noise ratio for sets of five tapers
with parameters $\Omega N = 8\pi$, $\beta = 0.6, 1.05$ and $\nu = 0, 0.01, 0.1$, and 1. For a given
signal-to-noise ratio, the expected value of the F-test grows with increasing $\nu$, reaching a maximum
when $\nu = 1$. Suppose one wants to detect a decaying sinusoid at the 99 per cent confidence
level. To do this using tapers which have $\beta = 0.6$ and $\nu = 0$ requires a 25 per cent higher
signal-to-noise ratio than performing the analysis with tapers which have parameters $\beta = 0.6$
and $\nu = 0.1$. Using tapers with $\beta = 1.05$ and $\nu = 0$, a 125 per cent larger value of $|\mu|^2/\sigma_N^2$ is
required than employing tapers designed with $\beta = 1.05$ and $\nu = 0.1$. There is a tradeoff
between detection capability (Fig. A2) and frequency uncertainty (Fig. A1), but tapers
designed with $0.01 \le \nu \le 1$ provide reasonable performance in both areas.

Figure A2 (a, b). Expected value of the F-test at the true frequency $\omega_T$ as a function of $N|\mu|^2/\sigma_N^2$ for
five tapers with $\Omega N = 8\pi$, $\beta = 0.6$ and $\beta = 1.05$, and $\nu = 0, 0.01, 0.1$ and 1. Larger values of $\nu$ are plotted
with increasingly shorter dashes. The 99 per cent confidence level for an F-distributed random variable
with 2 and 8 degrees of freedom ($F = 8.65$) is shown. For a given initial signal-to-noise ratio, $\langle F\rangle$ increases
as $\nu$ increases. Therefore, it is easier to detect a decaying sinusoid using tapers designed with large values
of the noise parameter $\nu$.
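The detection threshold and the expectation (A.42) can be checked numerically with a minimal sketch. It assumes that the F variable has 2 and $2K - 2$ degrees of freedom (2 and 8 for $K = 5$, as in the caption of Fig. A2), and it treats "detection" as the expected value of $F$ reaching the threshold, which is a simplification introduced here rather than a definition taken from the text. The function names are hypothetical.

```python
def f_threshold_2dof(confidence, nu2):
    """Critical value of a central F variable with 2 and nu2 degrees of freedom.

    With 2 numerator degrees of freedom the tail probability has the closed
    form P(F > f) = (1 + 2 f / nu2) ** (-nu2 / 2), which is inverted here.
    """
    tail = 1.0 - confidence
    return 0.5 * nu2 * (tail ** (-2.0 / nu2) - 1.0)


def expected_f(gamma, k):
    """Expected value of the noncentral F statistic, equation (A.42)."""
    return (2.0 + gamma) * (k - 1) / (2.0 * (k - 2))


def gamma_for_expected_f(f_target, k):
    """Noncentrality parameter at which the expectation (A.42) equals f_target."""
    return 2.0 * (k - 2) * f_target / (k - 1) - 2.0


K = 5                                   # number of tapers
f99 = f_threshold_2dof(0.99, 2 * K - 2)  # 99 per cent level, 2 and 8 d.o.f.
gamma99 = gamma_for_expected_f(f99, K)
print(round(f99, 2), round(gamma99, 2))
```

The threshold reproduces the value $F = 8.65$ quoted in the caption of Fig. A2; the corresponding noncentrality parameter can then be converted to a signal-to-noise ratio through (A.41) once the values $V_k(0; \beta, \nu)$ are known.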

For comparison, consider the spectral estimate obtained by taking the discrete Fourier
transform of a time series which has been multiplied by a cosine taper. A cosine taper
$w_c(t)$ is defined by

$$w_c(t) = A\left[1 - \cos\left(\frac{2\pi t}{N-1}\right)\right], \qquad t = 0, 1, \ldots, N-1, \tag{A.43}$$

where $A$ is chosen so that

$$\sum_{t=0}^{N-1} [w_c(t)]^2 = 1.$$

A direct estimate of the spectrum of the data $x(t)$ is $|y_c(\omega)|^2$, where

$$y_c(\omega) = \sum_{t=0}^{N-1} \exp(-i\omega t)\, w_c(t)\, x(t). \tag{A.44}$$

The peak frequency $\omega_c$ defined by

$$\left.\frac{d}{d\omega}\,|y_c(\omega)|^2\right|_{\omega=\omega_c} = 0 \tag{A.45}$$

is taken as the estimator of the true frequency $\omega_T$ of a sinusoidal signal in the data. As
before, $\omega_c$ is an unbiased estimator of $\omega_T$, and it has mean-square error

((  cjc  U)j  )2)  - —  - - r~  _  (A.4(>) 

O'-1'-'111)  l_t> 

Expressions for the expectations on the right-hand side of (A.46) are identical to (A.17) and
(A.18) with $w_c(t) \exp(-\alpha t)$ replacing $v(t)$, and $P = 1$.

For data consisting only of Gaussian white noise, $2|y_c(\omega)|^2/\sigma_N^2$ is $\chi^2$ distributed with two
degrees of freedom, and there is a probability of 0.01 that $2|y_c(\omega)|^2/\sigma_N^2$ will reach or exceed
9.21 (Abramowitz & Stegun 1965). If $|y_c(\omega)|^2$ exceeds the value $9.21\,\sigma_N^2/2$ at some
frequency, then one is more than 99 per cent confident that a signal exists at that frequency.
It is easy to show that, for the time series (A.1),

$$\langle|y_c(\omega_T)|^2\rangle = |\mu|^2 \left|\sum_{t=0}^{N-1} w_c(t) \exp(-\alpha t)\right|^2 + \sigma_N^2 \tag{A.47}$$

so that

$$\frac{N|\mu|^2}{\sigma_N^2} = 3.6\,N \left(\sum_{t=0}^{N-1} w_c(t) \exp(-\alpha t)\right)^{-2} \tag{A.48}$$

is the value of the initial signal-to-noise power ratio associated with 99 per cent detection
confidence at frequency $\omega_T$.
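Equation (A.48) is simple to evaluate numerically. The sketch below does so under two assumptions that are not fixed by the text as it survives here: the taper is taken as $A[1 - \cos(2\pi t/(N-1))]$, and the decay rate is related to the decay parameter by $\alpha = \beta\pi/T$ with $T = N$ samples. The function names are hypothetical.

```python
import math


def chi2_two_dof_level(confidence):
    """Level exceeded with probability 1 - confidence by a chi-squared variable
    with two degrees of freedom, for which P(X > x) = exp(-x / 2)."""
    return -2.0 * math.log(1.0 - confidence)


def cosine_taper(n):
    """Cosine taper A [1 - cos(2 pi t / (n - 1))], normalised to unit energy."""
    raw = [1.0 - math.cos(2.0 * math.pi * t / (n - 1)) for t in range(n)]
    norm = math.sqrt(sum(w * w for w in raw))
    return [w / norm for w in raw]


def required_snr(n, beta, confidence=0.99):
    """Initial signal-to-noise power ratio N |mu|^2 / sigma_N^2 from (A.48)."""
    # From (A.47): |mu|^2 S^2 + sigma^2 must reach (level / 2) sigma^2,
    # so the factor multiplying N / S^2 is level / 2 - 1 (about 3.6 at 99%).
    factor = chi2_two_dof_level(confidence) / 2.0 - 1.0
    alpha = beta * math.pi / n          # assumed relation alpha = beta * pi / T
    w = cosine_taper(n)
    s = sum(wt * math.exp(-alpha * t) for t, wt in enumerate(w))
    return factor * n / (s * s)


snr_105 = required_snr(1000, 1.05)
print(round(chi2_two_dof_level(0.99), 2), round(snr_105, 1))
```

Under these assumptions the case $\beta = 1.05$ comes out a little above 100, close to the figure quoted for the cosine taper in the discussion that follows; the exact value depends on the assumed relation between $\alpha$ and $\beta$.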

Suppose one wants to detect a decaying sinusoid with decay parameter $\beta = 0.6$ (or decay
rate $\alpha = 0.6\pi/T$) at the 99 per cent confidence level. Using the spectral estimate
$|y_c(\omega)|^2$, a value of $N|\mu|^2/\sigma_N^2$ of approximately 42 is required, whereas using an F-test and
five tapers for $\beta = 0.6$, $\nu = 0.1$, $N|\mu|^2/\sigma_N^2 \approx 23.5$ corresponds to detection at the 99 per cent
confidence level. If the decaying sinusoid has a decay parameter of $\beta = 1.05$, an initial
signal-to-noise ratio of $N|\mu|^2/\sigma_N^2 \approx 104$ is needed for 99 per cent confidence level detection using
the spectral estimate $|y_c(\omega)|^2$, but $N|\mu|^2/\sigma_N^2$ only needs to be 38 when the multitaper
method is applied, using five tapers with $\beta = 1.05$ and $\nu = 0.1$. In this case, the multitaper
approach is 274 per cent more efficient than the cosine-taper spectral method.

Clearly, the spectral estimate $|y_c(\omega)|^2$ is a less sensitive detector of decaying sinusoids in
a time series than the multitaper method. Much of this discrepancy in detection ability is
due to the eigentaper's preferential weighting of the start of the record where the
signal-to-noise ratio is greater. Also, more information is extracted from a given time series by
applying several tapers; the extra degrees of freedom allow a better-constrained least-squares fit of
the decaying sinusoid model to the data. Another advantage of the multiple-taper technique
is that it allows one to discriminate between signals which are truly harmonic, and those


Wllicil  ildVC  ttUll 


spectral estimates employing single tapers do not.

The variance of the random variable $F(\omega_T)$ can also be expressed in terms of the
noncentrality parameter $\gamma$ defined in (A.41):

$$\mathrm{var}\,[F(\omega_T)] = \frac{(K-1)^2\,[4(\gamma + 1)(K - 1) + \gamma^2]}{8\,(K-2)^2\,(K-3)} \tag{A.49}$$

when the data are given by (A.1), and $\mu$ is purely real or purely imaginary. The height of an
F-test peak is not very well determined; $\sqrt{\mathrm{var}\,[F(\omega_T)]} > \tfrac{1}{3}\langle F(\omega_T)\rangle$ when $K = 5$ for values
of $\gamma$ above the 99 per cent detection threshold.

Spectral estimation procedures which employ several prolate spheroidal sequences as tapers have
been shown to yield better results than standard single-taper spectral analysis when used on a
variety of engineering data. We apply the adaptive multitaper spectral estimation method of
Thomson [1982] to a number of high-resolution digital seismic records and compare the results to those
obtained using standard single-taper spectral estimates. Single-taper smoothed-spectrum estimates
are plagued by a trade-off between the variance of the estimate and the bias caused by spectral
leakage. Applying a taper to reduce bias discards data, increasing the variance of the estimate.
Using a taper also unevenly samples the record. Throwing out data from the ends of the record
can result in a spectral estimate which does not adequately represent the character of the spectrum
of nonstationary processes like seismic waveforms. For example, a discrete Fourier transform of
an untapered record (i.e., using a boxcar taper) produces a reasonable spectral estimate of the
large-amplitude portion of the seismic source spectrum but cannot be trusted to provide a good
estimate of the high-frequency roll-off. A discrete Fourier transform of the record multiplied by a
more severe taper (like the Hann taper) which is resistant to spectral leakage leads to a reliable
estimate of high-frequency spectral roll-off, but this estimate weights the analyzed data unequally.
Therefore single-taper estimators which are less affected by leakage not only have increased
variance but also can misrepresent the spectra of nonstationary data. The adaptive multitaper
algorithm automatically adjusts between these extremes. We demonstrate its advantages using 16-bit
seismic data recorded by instruments in the Anza Telemetered Seismic Network. We also present
an analysis demonstrating the superiority of the multitaper algorithm in providing low-variance
spectral estimates with good leakage resistance which do not overemphasize the central portion of
the record.


1. INTRODUCTION

Spectral estimation is a powerful method of data
analysis which is often used to study geophysical
processes. The estimation of the spectra of background
noise, line components, and transient signals is central to
the analysis of electric, magnetic, and seismic time series.
There have been many techniques developed which are
effective for the analysis of long records of stationary
processes. Unfortunately, these techniques are not
universally applicable to seismic data sets. In many
studies it is necessary to estimate a spectrum from a short
time series. This situation can occur if some of the data
are missing or if the data of interest (e.g., a seismic phase)
are contained in a short segment of a longer record.

A new approach for estimating the spectra of short time
series, known as multitaper spectral analysis, has been
developed recently. We have applied this technique,
which was first presented by Thomson [1982], to several
dozen seismograms; in this paper we analyze two
representative records. The spectra estimated using the
multitaper technique are compared with several direct
spectral estimates employing commonly used single tapers.
We show that the multitaper approach can yield superior
results when applied to high-frequency seismic data.

Spectral analysis of specific phases within a seismogram,
particularly those at regional or local distances, can be
difficult. It is often impossible to isolate a particular
phase. If one isolates a major phase by discarding the rest
of the record and then makes a direct estimate of the
waveform's spectrum without first tapering the data (i.e.,
using a boxcar taper), the high-frequency roll-off of the
estimated spectrum can be severely biased by spectral
leakage. Therefore it is standard practice to multiply the
time series by a taper before performing a discrete Fourier
transform (DFT) to reduce spectral leakage (an extensive
review of tapering is provided by Harris [1978]).

The cosine or Hann taper is popular in seismic analysis,
being both effective and easy to calculate. The utility of
the Hann taper is bought dearly, however: if one views
each data point in a time series as a constraint on the
estimated frequency content of the record, the Hann taper
discards 5/8 of the statistical information in a given time
series. This can be easily seen from the graph of the
Hann taper in Figure 1. The data points at the extremes
of the record are weighted weakly, while the center of the
time series is emphasized. This unequal weighting causes
the statistical variance of a direct spectral estimate using a
Hann taper to be greater than the variance of a periodogram
spectral estimate.

Fig. 1. Comparison plot of boxcar, Hann, and 20% cosine tapers.

Blackman-Tukey tapers are designed to circumvent this
loss of information somewhat by applying the cosine
weighting to only the extremes of the record. For
instance, the 20% cosine taper (Figure 1) discards only
12.5% of the available data variance constraints. However,
Harris [1978] shows that this taper has less resistance
to spectral leakage than a Hann taper. As long as
only a single data taper is used, there will be a trade-off
between the resistance to spectral leakage and the variance
of a spectral estimate.
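The fractions 5/8 and 12.5% quoted above can be reproduced with a short calculation. The sketch assumes that the "information discarded" by a taper is measured as one minus the mean squared weight of the taper scaled to unit peak; this is one reading that happens to match both figures, not a definition given in the text. The function names are hypothetical.

```python
import math


def hann(n):
    """Hann (full cosine) taper with unit peak."""
    return [0.5 * (1.0 - math.cos(2.0 * math.pi * t / (n - 1))) for t in range(n)]


def split_cosine(n, fraction):
    """Cosine bell applied to `fraction` of the record, half at each end."""
    edge = fraction / 2.0
    taper = []
    for t in range(n):
        u = t / (n - 1)
        d = min(u, 1.0 - u)            # distance to the nearer end of the record
        if d < edge:
            taper.append(0.5 * (1.0 - math.cos(math.pi * d / edge)))
        else:
            taper.append(1.0)
    return taper


def fraction_discarded(taper):
    """One minus the mean squared weight of the peak-normalised taper."""
    peak = max(abs(w) for w in taper)
    return 1.0 - sum((w / peak) ** 2 for w in taper) / len(taper)


print(round(fraction_discarded(hann(1000)), 3))
print(round(fraction_discarded(split_cosine(1000, 0.2)), 3))
print(fraction_discarded([1.0] * 1000))
```

For a long record the Hann taper gives a value near 0.625 and the 20% cosine taper a value near 0.125, while the boxcar discards nothing; the price of the boxcar is its poor leakage resistance.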

Thomson [1982] introduced the multitaper spectral
analysis technique. First, the data are multiplied by not
one, but several leakage-resistant tapers. This yields
several tapered time series from one record. Taking the
DFTs of each of these time series, several eigenspectra
are produced which are combined to form a single spectral
estimate.

The tapers are constructed so that each taper samples
the time series in a different manner while optimizing
resistance to spectral leakage. The statistical information
discarded by the first taper is partially recovered by the
second taper, the information discarded by the first two
tapers is partially retrieved by the third taper, and so on.
Only a few low-order tapers are employed, as the higher-order
tapers allow an unacceptable level of spectral leakage.
One can use these tapers to produce an estimate that
is not hampered by the trade-off between leakage and variance
that plagues single-taper estimates, as we will demonstrate.

Single-taper spectral estimates have relatively large variance
(increasing as a larger fraction of the data is discarded
and the bias of the estimate is reduced) and are
inconsistent estimates (i.e., the variance of the estimate
does not drop as one increases the number of data). To
counteract this, it is conventional to smooth the single-taper
spectral estimate by applying a moving average to
the estimate. This reduces the variance of the estimate but
results in a short-range loss of frequency resolution and
therefore an increase in the bias of the estimate.

The multitaper spectral estimates are formed as a
weighted sum of the eigenspectra. Therefore the multitaper
spectral estimate is already a smooth estimate, it has
less variance than single-taper spectral estimates which
have been designed to reduce bias, and it is also a
consistent estimator. The comparison between the bias and
variance properties of the single-taper and multitaper
estimates is discussed further in sections 3 and 4.


Another difficulty with seismic data is that the records
are nonstationary, that is, the statistical character of the
data changes with position in the record. Therefore a
spectral estimator which weights the data in the center of
the time series more heavily than data at the ends can
overemphasize the signal energy in the middle of the
record. This can result in a misrepresentation of the
spectrum, as we demonstrate in section 3. The multitaper
estimate, which discards very little data from the record and
weights the data relatively evenly, is not subject to this
problem.

Section 2 presents an outline of the basic algorithm.
This outline contains sufficient detail to allow the reader to
implement the algorithm but avoids derivations that can
be found elsewhere. Section 3 describes the seismic data
used in this study and presents comparisons of the spectra
of seismic time series generated by both regional and local
events. We demonstrate the trade-off between spectral
leakage resistance and variance of the spectral estimates
produced using the boxcar, 20% cosine, and cosine tapers.
We compare the bias and variance of these conventional
single-taper direct spectral estimates with the bias and
variance of the multitaper spectral estimates in section 4.
A numerical method for calculating the prolate eigentapers
is given in the appendix.

2. THE MULTITAPER ALGORITHM

The multitaper method is based on a family of tapers
which are resistant to spectral leakage. We outline the
multitaper method here, and note that more detailed
treatments can be found in the works by Thomson [1982] and
Lindberg [1986].

Suppose  that  we  are  given  the  finite  time  series 
v  ./  0.1.  .A  I  which  is  a  set  of  discrete  samples  of 
a  continuous  tunc  process  v  .0  /  •  A  11  (we  assume  a 
unit  sample  interval  -  I.  without  loss  of  generality  F  It 
\  has  no  harmonic  components,  then  it  has  the  Cramer 
spectral  representation  [Doob.  1 453] 

v  f  <  \<J  )dl 

AVc  wish  to  estimate  the  amplitude  spectrum 
.S  t  /  •  /.  \  i  /  • 1  (where  /:  denotes  expected  xaluet 

ol  the  continuous  time  process  \.  .()■  i  A  I!  from  the 
time  series  v  A  conventional  di.ecl  spectral  esti¬ 

mate  T  i/i  of  Vi  /  i  is  found  by  multiplying  the  data 
x  %  by  a  sequence  </  V.  called  a  taper,  applving  a 
OFT. 

v  o  i  X  “  v 

and  finally  taking  the  squared  modulus  of  the  resulting 
function  X  (/  i  Although  r  is  discrete.  /  is  continuous, 
with  /  (as  the  Nyquist  frequency  /  \  We 

normalize  the  taper  so  that 

i  «/.■•  i 

The spectral leakage properties of the data taper $a_t$,
$t = 0, 1, 2, \ldots, N-1$, can be deduced from its DFT

$$A(f) = \sum_{t=0}^{N-1} a_t e^{-i 2\pi f t} \qquad (1)$$

For conventional tapers the function $|A(f)|$ has a broad main
lobe and a succession of smaller side lobes. For example,
for the boxcar taper, $a_t = 1/\sqrt{N}$ and

$$A(f) = \frac{1}{\sqrt{N}}\,\frac{1 - e^{-i 2\pi f N}}{1 - e^{-i 2\pi f}}
       = \frac{e^{-i\pi f (N-1)}}{\sqrt{N}}\,\frac{\sin N\pi f}{\sin \pi f}$$

In this case the function $|A(f)|$ is readily observed to
have a central lobe flanked by smaller side peaks. (The
phase factor $e^{-i\pi f (N-1)}$ results from choosing the time
origin $t = 0$ to coincide with the first data point. It does not
affect the leakage resistance of the taper.) The larger the
side lobes, the more spectral leakage is encountered using
this taper, biasing the estimate $X_a$ away from its desired
value. This can be seen by observing that

$$X_a(f) = \int_{-1/2}^{1/2} A(f - f')\, dX(f') \qquad (2)$$

which follows from substituting the Cramer spectral
representation of $x_t$ in the definition of $X_a$, and therefore

$$E\{|X_a(f)|^2\} = \int_{-1/2}^{1/2} |A(f - f')|^2\, S(f')\, df'$$

A good data taper should have a spectral window
$A(f - f')$ whose amplitude is large in the central lobe
region where $|f - f'|$ is small and has low side lobes at
more distant frequencies. This reduces the bias in the
estimate by preventing the energy in $X$ at distant
frequencies from leaking over to affect the estimate $|X_a(f)|^2$ at
frequency $f$.

Suppose we wish to minimize the bias at frequency $f$
due to spectral leakage from outside the frequency band
$|f' - f| \le W$, where $2W$ is some chosen bandwidth. We
maximize the fraction of energy of $A$ within the chosen
band:

$$\lambda(N, W) = \frac{\int_{-W}^{W} |A(f)|^2\, df}{\int_{-1/2}^{1/2} |A(f)|^2\, df} \qquad (3)$$

Fig. 2. The five lowest-order $4\pi$ prolate eigentapers. The
zeroth-order eigentaper $v^{(0)}$ is plotted with a solid line, and the
higher-order tapers are plotted with dashed lines.
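As a rough numerical illustration of the concentration measure (3), the following Python sketch integrates $|A(f)|^2$ for the unit-energy boxcar taper with a midpoint rule. The function names, the step count, and the choice N = 128 are assumptions made for this illustration and are not taken from the original analysis.

```python
import math

def boxcar_window_power(f, n):
    """|A(f)|^2 for the unit-energy boxcar taper a_t = 1/sqrt(n)."""
    s = math.sin(math.pi * f)
    if abs(s) < 1e-12:
        # Limit of sin^2(n*pi*f) / (n*sin^2(pi*f)) as f approaches an integer.
        return float(n)
    return math.sin(n * math.pi * f) ** 2 / (n * s * s)

def boxcar_concentration(n, w, steps=20000):
    """Midpoint-rule estimate of (3) for the boxcar taper.

    The denominator of (3) equals 1 because sum(a_t^2) = 1 (Parseval),
    so only the integral over |f| <= w has to be evaluated.
    """
    h = 2.0 * w / steps
    total = 0.0
    for j in range(steps):
        total += boxcar_window_power(-w + (j + 0.5) * h, n)
    return h * total

N = 128                                   # illustrative series length
lam_4 = boxcar_concentration(N, 4.0 / N)  # band of half-width 4 Rayleigh bins
lam_1 = boxcar_concentration(N, 1.0 / N)  # main lobe only
print(lam_4, lam_1)
```

Under these assumptions the two values should come out near 0.975 and 0.903, which can be compared with the boxcar entry of Table 1 and with the single-periodogram figure quoted in section 4.2.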


Since no finite time series can be completely band-limited,
$\lambda(N, W) < 1$ for finite $N$ and nontrivial $W$. The
functional $\lambda$ can be interpreted as follows: in a single-taper
direct estimate of the spectrum of a white noise process at
frequency $f$, $\lambda$ is the fraction of spectral energy in that
estimate that derives from the frequency interval
$|f - f'| \le W$; $1 - \lambda$ is the fraction of spectral energy that
leaks in from outside that band.

It is convenient from a computational viewpoint to
substitute (1) into (3) to express $\lambda$ in terms of the data tapers
themselves rather than their transforms. If we seek a
taper for an $N$ point time series, the sequence $\{a_t\}_{t=0}^{N-1}$ can
be represented as an $N$ vector $\mathbf{a}$. This notation allows us
to express (3) in matrix form (following the derivation of
Park et al. [1987], letting the decay rate vanish):

$$\lambda(N, W) = \frac{\mathbf{a} \cdot \mathbf{C} \cdot \mathbf{a}}{\mathbf{a} \cdot \mathbf{a}} \qquad (4)$$

where carrying out the frequency integral in (3) gives the matrix
components $C_{tt'} = \sin[2\pi W (t - t')] / [\pi (t - t')]$, with
$C_{tt} = 2W$.

We seek those values of $\mathbf{a}$ for which the functional $\lambda$ is
stationary. This leads to the matrix eigenvalue problem

$$\mathbf{C} \cdot \mathbf{a} - \lambda(N, W)\,\mathbf{a} = 0 \qquad (5)$$

which has as its solutions the ordered eigenvalues
$1 > \lambda_0 > \lambda_1 > \lambda_2 > \ldots > \lambda_{N-1} > 0$ and associated
eigenvectors $\mathbf{v}^{(k)}(N, W)$; $k = 0, 1, 2, \ldots, N-1$ (which have
components $v_t^{(k)}$, $t = 0, 1, 2, \ldots, N-1$). The $\mathbf{v}^{(k)}(N, W)$ are

discrete prolate spheroidal sequences [Slepian, 1978],
which we also refer to as prolate eigentapers. We will
suppress the explicit dependence of $\mathbf{v}^{(k)}$ on $N$ and $W$ in
the following. A prolate eigentaper with a time-bandwidth
product of $P = NW$ is called a $P\pi$ prolate taper; it
concentrates spectral energy in frequency bands of width
$2W = 2P/N$. As the Rayleigh frequency $1/N$ is the fast
Fourier transform (FFT) frequency bin spacing, a $P\pi$
prolate taper will have a main lobe which is $2P$ "frequency
bins" wide. For instance, tapers for which $NW = 4$
minimize the spectral leakage at frequency $f$ from outside the
frequency band defined by $|f - f'| \le 4/N$. For large $N$
($\ge 100$) one can construct a set of the $\mathbf{v}^{(k)}$ for any value
of the time-bandwidth product $NW$. As noted in the
appendix, this allows the user to calculate one set of
eigentapers $\mathbf{v}^{(k)}$ for a fixed value of $N$ and to interpolate
this set to construct tapers for time series of various
lengths. We have restricted the following discussion to $4\pi$
prolate tapers, but similar behavior is found for other
choices of the time-bandwidth product.

The five lowest-order eigentapers $\mathbf{v}^{(k)}$, $k = 0, 1, 2, 3, 4$,
shown in Figure 2 have been made for a time series of
length $N = 128$ and time-bandwidth product $NW = 4$.
The lowest-order taper ($k = 0$) is the familiar $4\pi$ prolate
taper advocated by Thomson [1971, 1977a, b] and Eberhard
[1973] and has a shape similar to conventional tapers such
as the cosine taper (Figure 1). The higher-order
eigentapers are markedly different from ordinary data tapers.
For even values of $k$, the $\mathbf{v}^{(k)}$ are symmetric about the
midpoint of the time series. For odd values of $k$, the $\mathbf{v}^{(k)}$
are antisymmetric about the midpoint. All the tapers,
except the lowest-order one, have regions of positive and
negative data weighting. We normalize the tapers so that

$$\sum_{t=0}^{N-1} \left(v_t^{(k)}\right)^2 = 1$$

As the eigentapers $\mathbf{v}^{(k)}$ are solutions to (5), they are
orthogonal:

$$\mathbf{v}^{(j)} \cdot \mathbf{v}^{(k)} = \delta_{jk} \qquad (6)$$

(This can be clearly seen in Figure 2.) This relation shows
that each $\mathbf{v}^{(k)}$ can be used to provide an orthogonal sample
of the data $\{x_t\}_{t=0}^{N-1}$.

Fig. 3. Fourier transform amplitudes of the $4\pi$ prolate tapers
shown in Figure 2, using the same conventions for dashed and
solid lines.

Taking discrete Fourier transforms of the prolate
eigentapers produces the spectral windows

$$U_k(N, W; f) = \epsilon_k \sum_{t=0}^{N-1} v_t^{(k)} e^{-i 2\pi f [t - (N-1)/2]} \qquad (7)$$

where we have used the time-centered transform to
eliminate spurious phase factors in the definition. The
function $\epsilon_k = 1$ if $k$ is even; $\epsilon_k = i$ if $k$ is odd. The use of $\epsilon_k$
is a notational convention so that $U_k$ is real-valued. Plots
of the $U_k$ for $N = 128$ and $NW = 4$ appear in Figure 3 for
$k = 0, 1, \ldots, 4$. Most of the energy of the $U_k$ is
concentrated within the specified frequency band as was required
by  maximizing  (3).  The  spectral  windows  corresponding 
to  the  lowest-order  eigentapers  have  impressively  small 
side  lobes,  but  spectral  leakage  resistance  becomes  pro¬ 
gressively  poorer  as  the  order  of  the  taper  increases.  The 
lowest-order $2NW$ eigentapers (e.g., the eight lowest-order
$4\pi$ prolate tapers) have eigenvalues $\lambda_k$ close enough to
unity that they are useful for minimizing spectral leakage.
The eigenvalues $\lambda_k$ of the eight lowest-order eigentapers
with  time-bandwidth  products  4.  3,  and  2  are  given  in 
Table  1  for  N  =  128.  For  reference,  the  value  of  the  func¬ 
tional  (3)  is  given  for  a  boxcar  taper  which  concentrates 
spectral  energy  within  frequency  bands  of  the  same  width. 

To construct a multitaper spectral estimate, one first
calculates the complex "eigencoefficients" $y_k(f)$ by taking a
DFT of the product of the data with each $\{v_t^{(k)}\}_{t=0}^{N-1}$:

$$y_k(f) = \sum_{t=0}^{N-1} v_t^{(k)} x_t e^{-i 2\pi f t} \qquad (8)$$

An estimate of the spectrum can be made from weighted
sums of the eigenspectra $|y_k(f)|^2$. Thomson [1982]
formulates the problem of estimating the spectrum of a record
as an integral equation. The solution of the integral
equation is averaged over $(f - W, f + W)$ to produce the
smoothed high-resolution spectral estimate

$$\bar{S}(f) = \frac{1}{K} \sum_{k=0}^{K-1} \frac{1}{\lambda_k} |y_k(f)|^2 \qquad (9)$$

where $K$ is the number of tapers used. If $K$ is not large,
the smoothed high-resolution estimate (9) differs little
from an arithmetic average of the eigenspectra as $\lambda_k \approx 1$
for the lowest-order eigentapers.
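A minimal sketch of (8) and (9) on the FFT frequency grid follows. It assumes the eigentapers are obtained from a dense eigendecomposition of the concentration matrix (an illustrative shortcut, not the appendix method) and that the DFT uses the $e^{-i 2\pi f t}$ sign convention of NumPy's FFT. The helper names and the white-noise test series are hypothetical.

```python
import numpy as np

def tapers_from_c(n, nw, k):
    """Eigenvalues and unit-norm eigentapers from the dense matrix C (illustrative)."""
    w = nw / n
    lag = np.arange(n)[:, None] - np.arange(n)[None, :]
    vals, vecs = np.linalg.eigh(2.0 * w * np.sinc(2.0 * w * lag))
    order = np.argsort(vals)[::-1][:k]
    return vals[order], vecs[:, order]

def eigenspectra(x, v):
    """|y_k(f)|^2 from (8) at f = m/N, m = 0..N//2; result has shape (K, N//2 + 1)."""
    y = np.fft.rfft(v.T * x[None, :], axis=1)
    return np.abs(y) ** 2

def high_resolution_estimate(spec, lam):
    """Estimate (9): mean over k of the eigenspectra divided by their eigenvalues."""
    return np.mean(spec / lam[:, None], axis=0)

# Usage on one realization of unit-variance white noise (illustrative data).
rng = np.random.default_rng(0)
lam, v = tapers_from_c(128, 4.0, 7)
x = rng.standard_normal(128)
s_bar = high_resolution_estimate(eigenspectra(x, v), lam)
print(s_bar.shape, s_bar.mean())
```

For unit-variance white noise the estimate should scatter about a level close to one at every frequency, since each unit-norm taper passes the full process variance on average.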

Although straightforward, (9) is not the best multitaper
spectral estimate to use. An adaptive spectral estimate

$$\hat{S}(f) = \frac{\sum_{k=0}^{K-1} |d_k(f)\, y_k(f)|^2}{\sum_{k=0}^{K-1} |d_k(f)|^2} \qquad (10)$$

can be devised which has frequency-dependent weights
$d_k(f)$ chosen to reduce bias from spectral leakage
[Thomson, 1982]. This technique proves extremely useful in the
analysis of highly colored spectral processes. At
frequencies $f$ where the spectrum is reasonably flat, the weights
$d_k(f) \approx 1$, reducing the variance of the spectral
estimates. At frequencies $f$ where the spectrum has a steep
slope, the contribution from the higher-order eigentapers,
which have poorer leakage resistance, is reduced. The
trade-off  between  spectral  leakage  and  variance  of  the 
spectral  estimate  is  balanced  at  each  frequency. 

The optimal weights $d_k$ can be found by minimizing the
misfit of the estimated spectrum to the true spectrum
$S(f)$. This misfit, although unknown, can be estimated
statistically. The resulting equation for the weight $d_k(f)$
is

$$d_k(f) = \frac{\sqrt{\lambda_k}\, S(f)}{\lambda_k S(f) + E\{B_k(f)\}} \qquad (11)$$

where $S(f)$ is the true value of the spectrum at frequency
$f$ and $B_k(f)$ is the spectral energy at frequency $f$ that
leaks in from outside the frequency band $(f - W, f + W)$.
We replace the unknown value $S(f)$ by its estimate $\hat{S}(f)$.
Thomson [1982] found it adequate to approximate
$E\{B_k(f)\} \approx \sigma^2 (1 - \lambda_k)$, i.e., as a constant fraction of the
total variance of the time series:

$$\sigma^2 = \frac{1}{N} \sum_{t=0}^{N-1} x_t^2 \qquad (12)$$

TABLE 1. Fractional Leakage of Eigentapers

                        $P\pi$ Prolate
               P = 4           P = 3           P = 2
$\lambda_0$    0.9999999998    0.999999885     0.999948125
$\lambda_1$    0.999999978     0.999992014     0.997764652
$\lambda_2$    0.999999008     0.999750480     0.962155175
$\lambda_3$    0.999972984     0.995477689     0.733922358
$\lambda_4$    0.999500363     0.951033908     0.287339619
$\lambda_5$    0.993525891     0.725208760     *
$\lambda_6$    0.943750573     0.307789684     *
$\lambda_7$    0.721233936     0.060764834     *
Boxcar         0.974748450     0.966410435     0.949939339

*$\lambda_k < 0.05$.

We find the estimate $\hat{S}(f)$ by iteration. We take the
arithmetic average of $|y_0(f)|^2$ and $|y_1(f)|^2$ as an initial
estimate of $S(f)$, then substitute this value into (11) to
produce first guesses of the weights $d_k(f)$. These weights
are then used in (10) to generate a new spectral estimate
$\hat{S}(f)$, and the process is repeated. Convergence is usually
satisfactory within a few cycles.
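The iteration between (10) and (11) can be sketched as follows. The function name, the fixed iteration count, and the array layout are assumptions made for this illustration; the leakage term uses the approximation $E\{B_k(f)\} \approx \sigma^2 (1 - \lambda_k)$ quoted above.

```python
import numpy as np

def adaptive_estimate(spec, lam, sigma2, n_iter=5):
    """Iterate (11) and (10) starting from the mean of the two lowest eigenspectra.

    spec   : array of shape (K, F) holding the eigenspectra |y_k(f)|^2
    lam    : length-K eigenvalues lambda_k
    sigma2 : process variance, as in (12)
    Returns the spectral estimate (length F) and the final weights d_k(f), shape (K, F).
    """
    spec = np.asarray(spec, dtype=float)
    lam = np.asarray(lam, dtype=float)[:, None]
    leak = sigma2 * (1.0 - lam)                    # approximate E{B_k(f)}
    s = 0.5 * (spec[0] + spec[1])                  # initial estimate of S(f)
    for _ in range(n_iter):                        # a fixed, small number of cycles
        d = np.sqrt(lam) * s / (lam * s + leak)    # weights from (11)
        s = np.sum(d**2 * spec, axis=0) / np.sum(d**2, axis=0)  # estimate (10)
    return s, d

# Eigenvalues of the seven lowest-order tapers, P = 4 column of Table 1.
lam7 = np.array([0.9999999998, 0.999999978, 0.999999008, 0.999972984,
                 0.999500363, 0.993525891, 0.943750573])
flat = np.full((7, 4), 2.5)                        # idealized flat eigenspectra
s_flat, d_flat = adaptive_estimate(flat, lam7, sigma2=2.5)
low = np.full((7, 4), 1e-4)                        # level far below the variance
s_low, d_low = adaptive_estimate(low, lam7, sigma2=1.0)
print(d_flat[:, 0], d_low[:, 0])
```

In the flat case the weights settle at $\sqrt{\lambda_k}$ and the estimate returns the common level. Where the local level is far below the total variance, the weights of the higher-order tapers drop sharply, which is the downweighting described in the text.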

Careful  examination  of  the  adaptive  spectral  estimate 
shows  that  Parseval's  theorem  is  not  explicitly  satisfied, 
i.e.,  there  is  no  requirement  that  the  energy  of  the  spec¬ 
trum  estimate,  integrated  over  frequency,  equal  the  total 
variance of the time series. This arises from the way that
the multitaper algorithm attempts to compensate for the
effects of spectral leakage. If the expected broadband bias
were to vanish, then (11) would become
$d_k(f) = \lambda_k^{-1/2}$, and the adaptive estimate (10) would reduce
to the smoothed high-resolution estimate (9) (except for a
small multiplicative factor due to the departure of the
eigenvalues $\lambda_k$ from unity). This would occur if the true
spectrum were zero outside the frequency band
$|f - f'| \le W$. As $(1 - \lambda_k)$ of the process variance within
the frequency interval $|f - f'| \le W$ is leaked outside the
band, the limiting case $d_k(f) = \lambda_k^{-1/2}$ represents an attempt
by the estimator to compensate for this spectral energy
lost to leakage by boosting the coefficients of the
higher-order eigenspectra in the weighted sum. When the
spectrum has a steep slope, the higher-order eigenspectra are
downweighted and the adaptive spectral estimate tends
toward the least biased eigenspectral estimate $|y_0(f)|^2$.

Thomson [1982] analyzed two synthetic time series using
multitaper methods. Both series had fewer than 100 data
points and a numerical precision of roughly 20 bits. In the
first example, it was demonstrated that a multitaper
approach could accurately estimate a spectrum with a
dynamic range of more than seven decades and accurately
infer the existence of harmonic lines (i.e., coherent
sinusoids) in the data. Thomson also analyzed a 64-point
time series used by Kay and Marple [1981] in a "spectrum
analysis shootout" comparing 11 spectral estimation
techniques, including the maximum entropy method as well as
a single-taper direct spectral estimate and several other
popular spectral estimates. Unlike any of the techniques
tested by Kay and Marple [1981], a multitaper technique
was able to produce a spectral estimate which was similar
to the true spectrum of the synthetic time series.

3. Spectral Comparisons Using Seismic Data

We  compare  a  number  of  single-taper  direct  spectral 
estimates with the adaptive multitaper spectral estimate
techniques on wide dynamic range, high-resolution seismic
data. The advent of digital arrays with 16-bit data loggers
and the proposed 22- or 24-bit precision instruments
demand an improved sophistication in data analysis
techniques. We may soon have seismic data which are
recorded to the same precision as the synthetic examples
of Thomson [1982].

The  data  used  in  this  paper  were  recorded  on  seismom¬ 
eters  in  the  Anza  Seismic  Telemetered  Array.  The  Anza 
array  was  designed  to  record  high-frequency  seismic  sig¬ 
nals  from  local  earthquakes.  The  instruments  in  this  array 
measure  surface  velocity,  and  the  data  are  recorded  as  16- 
bit  numbers  (this  allows  a  dynamic  range  of  96  dB).  See 
Berger et al. [1984] for a more detailed description of the
Anza array.

The multitaper spectral estimate has a smaller variance
at each frequency than a single-taper direct spectral
estimate. To make a fair comparison between the various
direct spectral estimates and the adaptive multitaper
method, we will smooth each single-taper estimate using a
moving average so that each estimate averages
information over roughly the same frequency band as a multitaper
estimate using seven $4\pi$ prolate eigentapers.

The  effect  of  smoothing  single-taper  direct  spectral  esti¬ 
mates  in  this  way  is  shown  in  Figure  4.  The  section  of  the 
seismogram  which  is  analyzed  is  shown  at  the  top  of  Fig¬ 
ure  4.  The  unsmoothed  spectral  estimates  are  shown 
below  on  the  left,  and  the  smoothed  estimates  are 
displayed  on  the  right  below  the  record.  The  upper  traces 
are  direct  estimates  using  the  Hann  taper,  the  middle 
traces  are  spectral  estimates  made  with  a  20%  cosine,  and 
the  lower  traces  are  spectrum  estimates  which  employ  a 
boxcar  taper.  The  amplitude  is  plotted  on  a  logarithmic 
scale  on  the  vertical  axis,  and  frequency  is  plotted  on  a 
linear  scale  on  the  horizontal  axis.  Each  trace  is  offset  by 
a  multiplicative  factor  of  50  from  the  adjacent  traces. 
Notice that if one studies the unsmoothed spectral
estimates, it is difficult to distinguish any specific features
common to each of the estimates except for a general
linear  trend.  In  comparison,  the  smoothed  spectral  esti¬ 
mates  have  many  of  the  same  features.  Each  major  peak 
or  trough  appears  at  the  same  frequency  in  each  of  the 
smoothed  estimates. 

Unfortunately,  since  we  are  using  real  data,  it  is  impos¬ 
sible  to  know  the  true  spectrum  for  any  of  the  examples. 
However,  the  work  of  Thomson  [1982]  demonstrates  that 
the multitaper method provides a reasonable spectral
estimate. This is confirmed by a study comparing the
multitaper estimate with the smoothed direct estimates on a
synthetic seismic wave train with a known spectrum (C.
Lindberg et al., unpublished manuscript, 1987).

To  study  how  tapering  affects  the  spectra  of  body  wave 
pulses,  we  isolate  a  phase  in  the  middle  of  a  seismogram, 
produce  spectral  estimates  using  each  of  the  four  methods, 
and  compare  the  results.  The  upper  graph  in  Figure  5 
shows  the  transverse  horizontal  seismogram  of  an  earth¬ 
quake  which  had  an  epicentral  distance  of  100  km  from 
the  recording  station  PFO  (in  Pinyon  Flat,  California). 
We extract that section of the seismogram corresponding
to the shear wave arrival and estimate its spectrum by
each method. The spectral estimates are plotted on a
linear-linear scale in the lower portion of Figure 5 and for
clarity are plotted in dimensionless velocity units on the
vertical axis. Each of the four spectral estimates has two
main peaks in the frequency band from 0 to 20 Hz, near 4
and 14 Hz.

These  estimates  are  interesting  to  compare.  Three  of 
the  estimated  spectra  (those  plotted  using  solid  and  dashed 
lines) have almost identical features (except for the offset
between them). In these spectral estimates the amplitude
of the peak at 14 Hz is about 20% less than the amplitude
of the peak at 4 Hz. The other estimated spectrum (curve
d, plotted with asterisks) does not resemble the other
three estimates closely. The peak at 14 Hz is 10% higher
than  any  other  peak  in  this  estimate.  This  change  in  the 
relative  amplitude  of  the  two  spectral  peaks  would 
influence  the  choice  of  a  corner  frequency  if  these  spectra 
were  converted  from  velocity  to  displacement  or  accelera¬ 
tion. 

Fig. 4. (Top) Comparison of unsmoothed and smoothed estimates of the spectrum of a high-frequency S wave.
The spectra are plotted on a log-linear scale and are offset to facilitate comparison. The boxcar spectral estimates are
graphed with a solid line. The dashed lines at the top of each of the lower figures are spectral estimates employing a
Hann taper. The middle curves are spectral estimates obtained using a 20% cosine taper.

The three spectral estimates which exhibit similar
characteristics are the multitaper estimate (curve a, plotted
as a solid line), the 20% cosine direct estimate (curve b,
the upper dashed line), and the boxcar direct estimate
(curve c, the lower dashed line). The spectrum showing a
different distribution of spectral energy was estimated
using the Hann taper (curve d). The Hann direct spectral
estimate is unlike the other three estimates because it
imposes a different emphasis on the time series.
Referring back to Figure 1, it is easy to see that the boxcar
applies equal weighting to the entire time series and the
20% cosine taper weights 80% of the series equally. Not
surprisingly, using either of these two tapers produces
essentially the same result. However, the multitaper
spectral estimate also gives essentially equal importance to
every data point, like the boxcar and 20% cosine estimates
(see Figure 2). The Hann taper puts over 80% of its
emphasis on the middle 50% of the time series and gives
the data in the first and last 25% of the series less weight.
This rejection of data near the ends of the series causes
the apparent misrepresentation of the distribution of
spectral energy shown in Figure 5.

We  also  compared  estimates  of  the  spectrum  of  a  verti¬ 
cal  recording  of  a  nuclear  explosion.  This  event  had  an 
epicentral  range  of  412  km  and  also  was  recorded  at  PFO. 
The  section  of  data  which  was  analyzed  is  bounded  by  the 
vertical  dashed  lines  in  the  upper  trace  in  Figure  6.  The 
analysis  procedures  were  identical  to  those  used  in  the  pre¬ 
vious  example  except  that  the  log  amplitudes  of  the  spec¬ 
tra  were  plotted  on  the  vertical  axis. 

The  spectrum  of  the  nuclear  test  has  a  large  dynamic 
range  and  has  most  of  its  energy  concentrated  below 
20  Hz.  By  examining  the  estimated  spectra,  one  can  see 
that some estimates are more affected by spectral leakage
than others. The two estimates which are less subject to
spectral leakage, the Hann direct estimate (curve d,
plotted with asterisks) and the adaptive multitaper estimate
(curve a, the solid line), are very similar. Both of these
estimated spectra clearly show the spectrum of the signal
from 0 to 20 Hz; from 20 to 60 Hz the spectrum of the
ground noise is visible. The antialias filters of the
recording system are 6-pole Butterworth filters which have a
corner frequency of 62.5 Hz. The effect of the filters is
visible in the 60-80 Hz band. In the band from 80 to
125 Hz the ground noise is less than the instrument noise.
The variance of the adaptive multitaper spectrum is larger
in the low-amplitude portion of the spectrum and hence
appears unsmoothed. This is because the downweighting
of the higher-order eigenspectra minimizes spectral
leakage at the cost of reducing the effective number of degrees
of  freedom  of  the  estimate  at  each  frequency.  If  smaller 
variance  is  desired  in  the  low-amplitude  portion  of  the 
adaptive  multitaper  spectrum,  then  prolate  tapers  with  a 
larger  time-bandwidth  product  could  be  used. 

The spectra shown in Figure 6 which were obtained
using the 20% cosine and boxcar tapers suffer from the
effects of spectral leakage. The spectrum estimate
employing the 20% cosine (curve c, the lower dashed line) suffers
less from spectral leakage than the estimate utilizing the
boxcar, as expected. Leakage in the spectrum estimated
using a 20% cosine taper hides nearly all the features in
the ground noise between 20 and 60 Hz. The effect of the
antialias filters is completely obscured. The apparent
energy in the 20% cosine spectrum estimate is larger than
the instrument noise in the 80-125 Hz band by a factor of
10. The performance of the spectral estimate obtained
using a boxcar taper (curve b, the upper dashed line) is
even worse, since it does not exhibit any of the features of
the true spectrum between 20 and 125 Hz.

Fig. 5. A multitaper spectral estimate (solid line, labeled a) of the
frequency content of an SH wave (top) is compared with direct
spectral estimates using the boxcar taper (fine dashed line, labeled
b), 20% cosine taper (coarse dashed line, labeled c), and Hann
taper (asterisks, labeled d). The spectra are plotted using linear
scales for the horizontal and vertical axes. The boxcar, 20%
cosine, and multitaper estimates of the S wave spectrum are
almost identical, but the Hann taper estimate is substantially
different. This is because the Hann-tapered spectra
overemphasize the data in the center of the time series and downweight
data points toward the ends of the record. The section of the
time series which was analyzed is bracketed by dashed lines in the
seismogram at the top.

These  examples  show  that  each  of  the  spectral  estimates 
has  different  advantages.  The  smoothed  spectrum  esti¬ 
mate  employing  a  boxcar  taper  produces  a  good  estimate 
of  the  large-amplitude  portions  of  the  spectrum  but  has 
very  poor  spectral  leakage  properties  and  is  not  of  much 
use  for  spectra  which  have  a  large  dynamic  range.  The 
smoothed  spectrum  estimate  using  a  Hann  taper  is  less 
affected  by  spectral  leakage,  but  this  estimate  can 
misrepresent  the  large-amplitude  portion  of  the  spectrum. 
A smoothed spectral estimate incorporating the 20%
cosine taper combines the best properties of the spectral
estimates which use the boxcar and the Hann tapers. It
retrieves the large-amplitude features almost as well as the
boxcar  estimate  and  has  spectral  leakage  properties  which 
are  sufficient  for  many  geophysical  applications.  The  adap¬ 
tive  multitaper  estimate  has  even  better  performance, 
representing  the  large-amplitude  spectral  components  as 
accurately  as  the  boxcar  estimate  and  having  excellent 
spectral  leakage  properties. 

We  have  also  made  multitaper  estimates  of  the  spectra 
of  more  than  a  dozen  events  recorded  at  local  and  regional 


distances by the Anza array. Multitaper techniques like
the ones presented here and by Park et al. [this issue]
appear to be useful tools for seismic data analysis.

4.  Statistical  Comparisons 

We  compare  the  broadband  bias  and  variance  of  the 
smoothed  single-taper  direct  spectral  estimates  with  the 
smoothed high-resolution and adaptive multitaper
estimates. We consider smoothed single-taper and multitaper
spectral estimates whose values at some frequency $f$ are
formed by averaging seven direct spectral estimates which
concentrate the spectral energy at frequency $f$ mainly
within the frequency band $(f - W, f + W)$, where
$W = 4/N$ (4 times the Rayleigh frequency $1/N$).
Therefore we use $4\pi$ prolate eigentapers for the multitaper
estimates; the seven lowest-order $4\pi$ eigentapers have good
resistance to spectral leakage (see Table 1), but we do not
use the seventh-order $4\pi$ prolate eigentaper; it allows
excessive spectral leakage, as $\lambda_7 = 0.721233936$. We
compare the multitaper estimates with a smoothed
single-taper estimate $\bar{S}_a(f)$ which is formed by averaging the
seven direct spectral estimates $|X_a(f - j/N)|^2$; $j = -3,
-2, \ldots, 2, 3$ obtained using a taper $\{a_t\}$, i.e.,

$$\bar{S}_a(f) = \frac{1}{7} \sum_{j=-3}^{3} |X_a(f - j/N)|^2 \qquad (13)$$

This estimate is mostly an average of spectral energy from
the band $(f - 4/N, f + 4/N)$. (The main lobes of tapers
other than the boxcar taper are wider than $2/N$, so this is
not strictly correct, but we will make this approximation.)

Fig. 6. Comparison of the leakage of various estimates of the
spectrum of a vertical seismogram recorded 412 km away from a
Nevada Test Site explosion. The spectral estimate using a cosine
taper (asterisks, labeled d) and the multitaper spectral estimate
(solid line, labeled a) give good representations of the spectra of
the seismic signal (0-20 Hz) and ground noise (20-60 Hz).
The spectra are plotted using a log-linear scale.

TABLE 2. Statistical Comparison

                                          $(7/\sigma^4)$    Fractional
Estimate                                  Variance          Leakage $(1 - \lambda)$
Smoothed boxcar                           1.0000            0.0367
Smoothed 20% cosine                       1.0814            0.0192
Smoothed Hann                             1.8142            0.0093
Smoothed high resolution
  (seven $4\pi$ prolate eigentapers)      1.0196            0.0094
Adaptive multitaper
  (seven $4\pi$ prolate eigentapers)      1.0004            0.0094
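The seven-point average in (13) can be written out directly. The sketch below evaluates the tapered DFT by explicit summation at the seven frequencies $f - j/N$; the function names and the test signal (a cosine centered on an FFT bin, analyzed with a unit-energy boxcar taper) are assumptions chosen so that the result can be checked by hand.

```python
import cmath
import math

def tapered_dft(x, taper, f):
    """X_a(f) = sum_t a_t x_t exp(-i 2 pi f t), evaluated by direct summation."""
    return sum(a * xt * cmath.exp(-2j * math.pi * f * t)
               for t, (a, xt) in enumerate(zip(taper, x)))

def smoothed_direct_estimate(x, taper, f, half_width=3):
    """Average of |X_a(f - j/N)|^2 over j = -half_width..half_width, as in (13)."""
    n = len(x)
    terms = [abs(tapered_dft(x, taper, f - j / n)) ** 2
             for j in range(-half_width, half_width + 1)]
    return sum(terms) / len(terms)

# Illustrative check: a unit-amplitude cosine exactly on bin 10 of a 64-point record.
n = 64
x = [math.cos(2.0 * math.pi * 10 * t / n) for t in range(n)]
boxcar = [1.0 / math.sqrt(n)] * n           # unit-energy boxcar taper
s_smooth = smoothed_direct_estimate(x, boxcar, 10 / n)
print(s_smooth)
```

For this signal only the central term is nonzero, with $|X_a|^2 = N/4 = 16$, so the smoothed value is 16/7. Replacing `boxcar` with another unit-energy taper gives the corresponding smoothed single-taper estimate.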

4  1  Variance 

To  gain  some  idea  of  how  smooth  the  estimators  are, 
we  compare  the  variances  of  each  spectral  estimate  for  a 
time  series  composed  of  Gaussian  white  noise.  For 
single-taper  estimates  we  define  the  covariance  matrix 

V,  =  £ ! X* (f  -  #7  V )Xa  </  a-  ji .V ) I 

f  r  * 

(14) 

=  J  A"  (J  -  /,  .V  ),4  if  -  j  X)  df 

for  i.j  =  -3.- 2 . 2,2  (see  Thomson,  1982,  equation 

4.11,  where  it2  is  the  process  variance  defined  in  (12)  and 
.4  1./  )  is  the  spectral  window  introduced  in  (1).  If  the 
single-taper  estimates  .V, ,(/■*■  /  .V)  and  .*„(/-*•  j!  S')  are 
uncorrelated  for  i  ~j .  then  A  is  a  diagonal  matrix.  If  the 
amount  of  correlation  of  the  estimates  X„(J'~n.\)  is 
such  that  one  or  more  of  the  X„(f^ii\)  can  be 
expressed  as  a  linear  combination  of  the  others,  then  A  is 
a  singular  matrix,  with  at  least  one  zero  eigenvalue.  In 
practice.  A  has  a  behavior  which  lies  somewhere  between 
these  extremes. 

For  white  noise  data  the  expected  value  of  52  (/)  is 


£'5„  (/)!  -  ±  <r: 


for  all  1 . 


V,  ~  X  alU )  - 

1  1) 


I  V„  =  7 


[Papoulis,  1977,  chapter  11],  so 

Var|5„  (/)]  =  £  I  £ 

r  ij  .1 

For  the  smoothed  periodogram  estimate  (i.e..  direct 
estimate  using' a  boxcar  taper).  \M  =fi„ .  and  Var|5„.  (/))  = 
(ir4/7).  Values  of  (7/u-4)  Var',5^  (/)]  for  the  smoothed 
periodogram,  20%  cosine  taper  and  Hann  taper  direct 
spectral  estimates  are  tabulated  in  Table  2.  Notice  that  as 
more  data  are  discarded  by  the  taper,  the  variance  of  the 
spectral  estimate  increases. 

For  the  smoothed  high-resolution  multitaper  spectral 
estimate  (9). 

E\S(f)\  =  -I  l‘  Ua')£!U  (/)|’! 

^  k  0 

7  A.  ! 

“  I  (A*)  1 

•v  A  I) 

When  the  K  =  7  lowest-order  4n  prolate  eigentapers  are 
used,  then  E{S(J  )  1  =  ( 1 .0095) ir;.  so  the  estimate  S(J  ) 
is  mildly  biased  for  white  noise  data.  Also, 

Varl5 (/)!  =  £  1  [5 (/))-)  -  (£!5(/)]): 


£!k(/)l:l>v(/)P!  =  I  l'  (£lk (/')!-’! 

k  Ok'-  i» 

■  £  1 1>\  (/)l:!  -  Aa! 

as  the  eigentapers  are  orthonormal  (equation  (6))  so 

4x1x1  i 

Var !5(/')l  =  pr  I  I  — - 

A  "  a  0  A  0  aa 

V  J_ 

K 2  At)  Aa- 

When  the  K  =  7  Jowest-order  4tt  prolate  eigentapers  are 
used,  then  Var|5(/)1  =  <r 4(  1 .0 1 96)/ 7.  Therefore  the 
smoothed  high-resolution  multitaper  estimate  has  only 
slightly  more  variance  than  the  smoothed  periodogram 
estimate  for  Gaussian  while  noise  data. 

For  the  adaptive  spectral  estimator,  dk  for  white 

noise  data  (equation  (4.5)  of  Thomson  11982]).  Therefore 


x  1 

<r:  X  Aa 


£15(/)1  = 


and  £',S„  {J')\  =<rJ.  The  variance  of  S„  (/ )  is 

£!(5„(/))2!  -  (£!S„  (/)!)*  (15) 

The  first  term  of  (15)  is 

(  El\XJ/~  ^>PI XJ/  •  -j)Pi 

As  each  function  \X„  (/  rV;V)p  is  the  sum  of  squares  of 
two  Gaussian  random  variables,  we  can  show  that 


Var!5(/)l 


X  I  A  I 

r4  I  I  A*  Aa^aa 

k  0  k  (I 


K  1 

,r 4  I  (A* >: 

k  () 


£!  <5,  (./))-'!  -  JI  I  (\„  \„  •  |\„|’l 
For the K = 7 lowest-order 4π prolate eigentapers, $\mathrm{Var}[S(f)] = \sigma^4(1.00038)/7$. Therefore the
adaptive multitaper estimate has slightly less variance than the smoothed high-resolution estimate and
slightly more variance than the periodogram estimate.


4.2.  Bias 


It  is  not  useful  to  compare  the  bias  performance  of 
these  spectral  estimates  for  white  noise  data.  One  is  most 
interested  in  a  measure  of  broadband  bias.  Broadband 
bias  is  caused  when  spectral  energy  at  one  frequency  leaks 
away  to  affect  the  spectral  estimate  at  a  distant  frequency 
and  is  an  important  factor  to  consider  in  the  estimation  of 
spectra  of  colored  processes. 

We take as our measure of broadband bias the fraction of energy $(1-\lambda)$ in the frequency band
$|f - f'| < W$ that leaks out to affect the estimated spectrum at other frequencies. Suppose that the
record consists of a single sinusoid, so that the spectrum is highly colored. The kth eigenspectrum
retains $\lambda_k$ of the spectral energy of the sinusoid within a frequency band of width 2W centered on
the sinusoid frequency. The fractional leakage of the smoothed high-resolution spectral estimate is


$$1 - \lambda = 1 - \frac{\sum_{k=0}^{K-1}\int_{-W}^{W}|U_k(f)|^2\,df}{\sum_{k=0}^{K-1}\int_{-1/2}^{1/2}|U_k(f)|^2\,df}$$


If we use the seven lowest-order 4π prolate eigentapers in the estimate, $\lambda = 0.99057$, so
$1 - \lambda = 0.00943$. For the adaptive multitaper spectral estimate, a numerical calculation shows that
$1 - \lambda = 0.00256$.

The smoothed single-taper direct spectral estimates are also biased when the process has a colored
spectrum. A single periodogram estimate allows

$$1 - \lambda = 1 - \int_{-1/N}^{1/N}|A(f)|^2\,df \approx 1 - 0.903 = 0.097$$

of  the  energy  of  a  single  sinusoid  to  leak  outside  its  main 
lobe.  The  smoothed  periodogram  estimate  allows 

$$1 - \lambda = 1 - \frac{1}{7}\sum_{l=-3}^{3}\int_{-4/N}^{4/N}|A(f - l/N)|^2\,df = 0.0367$$

of the energy in $|f - f'| < 4/N$ to leak out. For the smoothed Hann taper estimate, we find
$1 - \lambda = 0.00934$, while the smoothed 20% cosine taper estimate allows $1 - \lambda = 0.0192$ of the
sinusoid's energy to leak out of $|f - f'| < W$.
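
The two boxcar leakage figures can be reproduced by integrating the squared transform of the boxcar taper directly. The sketch below is illustrative only. It assumes a unit-energy boxcar of length N = 128, for which $|A(f)|^2 = \sin^2(\pi N f)/(N\sin^2(\pi f))$, and a seven-bin smoother with bins spaced at 1/N. The helper name `boxcar_energy` is hypothetical.

```python
import numpy as np

def boxcar_energy(n_samples, lo, hi, n_grid=20001):
    """Integrate |A(f)|^2 over [lo, hi] for a unit-energy boxcar taper."""
    f = np.linspace(lo, hi, n_grid)
    small = np.abs(f) < 1e-12                 # treat f = 0 separately
    f_safe = np.where(small, 0.5, f)          # placeholder avoids 0/0
    power = np.sin(np.pi * n_samples * f_safe) ** 2 / (
        n_samples * np.sin(np.pi * f_safe) ** 2)
    power = np.where(small, float(n_samples), power)   # limit at f = 0 is N
    # trapezoidal rule written out explicitly
    return float(np.sum(0.5 * (power[1:] + power[:-1]) * np.diff(f)))

N = 128
# Single periodogram: energy kept inside the main lobe |f| < 1/N
main_lobe = boxcar_energy(N, -1.0 / N, 1.0 / N)
# Seven-bin smoothed periodogram: mean energy kept inside |f - f'| < 4/N
kept = np.mean([boxcar_energy(N, (-4 - l) / N, (4 - l) / N)
                for l in range(-3, 4)])
leak_single = 1.0 - main_lobe
leak_smoothed = 1.0 - kept
print(leak_single, leak_smoothed)
```

The shift of the integration limits by l/N is equivalent to shifting the argument of A, so each term in the list is the in-band energy of one of the seven averaged periodogram bins.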

The Hann and 20% cosine tapers do not permit as much spectral leakage as the boxcar taper, but only the
smoothed Hann taper estimates exhibit broadband bias characteristics which are as good as the
multitaper estimates. Numerical experiments using the ω-square and ω-cube source spectrum models of
Aki [1967] demonstrate that spectral estimates employing a boxcar taper are inadequate for
representing the source spectrum roll-off. The other tapers are sufficient unless the spectrum rolls
off more steeply than $f^{-4}$.

Fig. 7. Comparison of the variance and broadband bias of several single-taper spectral estimates
(solid circles) and the multitaper estimates (solid triangles).

Clearly, for smoothed single-taper spectral estimates there is a trade-off. The more severe the taper,
the less spectral leakage contaminates the estimate but also the larger the variance of the estimate.
The multitaper estimates manage to defeat this trade-off by using several orthogonal leakage-resistant
tapers in a single estimate. The relative variances and fractional spectral leakage that are associated
with each spectral estimate are listed in Table 2 and are plotted in Figure 7 for comparison.

5.  Conclusions 

Multitaper spectral analysis techniques offer the seismologist formal and practical advantages over
single-taper techniques. Adaptive reweighting of eigenspectra according to the predicted level of
spectral leakage enables well-constrained smoothed spectral estimates in portions of the spectrum that
have large amplitude, while retaining excellent resistance to spectral leakage in the region where
earthquake spectra exhibit a steep roll-off. Comparisons between direct spectral estimates produced
using boxcar, Hann, and 20% cosine tapers show that the boxcar taper estimate is contaminated by
spectral leakage, that the Hann taper estimates can be misleading in the high-amplitude portion of the
spectrum, and that the 20% cosine taper offers a compromise between these two extremes. Therefore a 20%
cosine taper may be adequate in many cases but would not be suitable for the analysis of either an
unusually dispersive or unusually band-limited seismic signal. However, these pathological situations
present no difficulty for the adaptive multitaper estimate.

There are drawbacks to using the multitaper method. The adaptive multitaper algorithm consumes more
computer time, since several FFTs must be computed for each time series and one needs to calculate a
set of prolate tapers for each time series length. The computational burden is becoming a less serious
problem as computer speeds increase and the cost of computation drops. Also, we have found it adequate
to calculate the prolate eigentapers once for a time series of length 128 and generate tapers of other
lengths N > 128 by interpolating the 128-point tapers using cubic splines (see the appendix).

Appendix: Calculating Pπ Prolate Eigentapers

We follow a procedure similar to that outlined in the appendix of Thomson [1982] to calculate
eigentapers which have a given time-bandwidth product NW and length N. A set of standard tapers of
length N' ≥ 100 is constructed. We use N' = 128, as a series whose length is a power of two is
convenient for calculating DFTs of the taper. The matrix eigenvalue problem (5) is then solved for the
largest 2N'W eigenvalues and the associated eigenvectors using EISPACK routines TRED1, BISECT, TINVIT,
and TRBAK1 [see Garbow et al., 1977]. This procedure determines only the largest eigenvalues and their
eigenvectors of a matrix, avoiding the numerical burden of fully decomposing the matrix. In this manner
one calculates the prolate eigentapers for a time series of length N'. Using the algorithm described
by Thomson [1982], one approximates the discrete time tapers with the continuous time prolate
spheroidal wave functions in order to set up an eigenvalue problem based on Gaussian quadrature. One
obtains discrete tapers at nonuniform sample points that can be interpolated to produce tapers with
even sampling and of a given length. In our applications we have chosen to use spline interpolation
routines to interpolate the evenly spaced tapers of length N' to produce tapers of length N > N'.
We address the inverse problem of finding the smallest envelope containing all velocity profiles
consistent with a finite set of imprecise $\tau(p)$ data from a spherical earth. Traditionally, the
problem has been attacked after mapping the data relations into relations on an equivalent flat earth.
Of the two contemporary direct methods for finding bounds on velocities in the flat earth consistent
with uncertain $\tau(p)$ data, a nonlinear (NL) approach descended from the Herglotz-Wiechert inversion
and a linear programming (LP) approach, only NL has been used to solve the spherical earth problem. On
the basis of the finite collection of $\tau(p)$ measurements alone, NL produces an envelope that is too
narrow: there are numerous physically acceptable models that satisfy the data and violate the NL
bounds, primarily because the NL method requires continuous functions as bounds on $\tau(p)$ and thus
data must be fabricated between measured values by some sort of interpolation. We use the alternative
LP approach, which does not require interpolation, to place optimal bounds on the velocity in the core.
The resulting velocity corridor is disappointingly wide, and we therefore seek reasonable physical
assumptions about the earth to reduce the range of permissible models. We argue from thermodynamic
relations that P wave velocity decreases with distance from the earth's center within the outer core
and quite probably within the inner core and lower mantle. We also show that the second derivative of
velocity with respect to radius is probably not positive in the core. The first radial derivative
constraint is readily incorporated into LP. The second derivative constraint is nonlinear and cannot be
implemented exactly with LP; however, geometrical arguments enable us to apply a weak form of the
constraint without any additional computation. LP inversions of core $\tau(p)$ data using the first
radial derivative constraint give new, extremely tight bounds on the P wave velocity in the core. The
weak second derivative constraint improves them slightly.

Introduction

There are very few kinds of geophysical data from which we are able to draw sound inferences about deep
earth structure. Most of the time, we are of necessity content finding an earth model that adequately
accounts for our measurements, disregarding the range of models that predict the data equally well, any
of which might resemble the actual earth more closely. Contributing to the nonuniqueness of the
solution is the paucity of data available versus the complete description of the earth we seek and the
fact that our few data are inexact. Even if we had an infinite amount of noise-free data, deliberate
approximations in our assumptions (e.g., that the earth is spherically symmetric) may force us to treat
the data as inexact. The issue of nonuniqueness can sometimes be resolved by choosing to optimize some
property of the earth model while fitting the data, resulting in a problem with only one solution. In
other cases, one can delineate the range of models that satisfy the data and the assumptions of the
derivations. One such problem is finding a corridor in the velocity-depth plane within which every
velocity model satisfying given seismic travel time data must lie. Since travel time data include
triplications and other complications, it is desirable to work instead with
$\tau(p) = T(p) - p\Delta(p)$, the vertical delay time as a function of ray parameter p. T is the travel
time, and Δ is epicentral distance in degrees. Ideally, $\tau(p)$ contains the same information as
travel time data, $T(\Delta)$, but is a monotonic function, continuous except where there are
low-velocity zones. Estimating $\tau(p)$ from the original records is a nontrivial matter, but we shall
assume that this step has been taken successfully.

The inverse problem of finding the maximum and minimum velocities at a given radius consistent with a
set of measurements of $\tau(p)$ on a spherical earth traditionally has been attacked after an exact
mapping into a similar problem for a flat earth. ($T(\Delta)$ then becomes $T(X)$, where X is
epicentral distance in kilometers.) There are two quite different approaches to the flat earth problem:
Bessonova et al. [1976] developed a nonlinear scheme (NL), a descendant of the Herglotz-Wiechert
integral solution; Garmany et al. [1979] transformed the problem so that the data relations were linear
and solved it with linear programming (LP). Both NL and LP require the assumption that there are no
low-velocity zones in the flat earth, i.e., that $dv/dz \ge 0$, where v is seismic velocity as a
function of flat earth depth z, or that the effect of low-velocity zones has been removed from the
$\tau(p)$ data. Bessonova et al. [1974, 1976] discuss how to preprocess the data to remove the traces
of low-velocity zones so that NL may be applied; Orcutt [1980] shows how the data may be prepared
similarly for LP inversion.

The work of Bessonova et al. [1976] is the latest in a chain of inversions of travel time data relying
upon the Herglotz-Wiechert integral solution [Aki and Richards, 1980, vol. II, chapter 12]; other
notable papers in the series include Gerver and Markushevitch [1966], Wiggins et al. [1973] and
Bessonova et al. [1974]. One of the explicit aims of the τ method (NL) of Bessonova et al. [1974] was
to avoid extrapolating $T(X)$ curves from the available finite collection of measurements.
Unfortunately, the extrapolation was merely moved from $T(X)$ to $\tau(p)$: NL requires continuous
bounds on τ over a range of p. One could imagine using nonparametric estimates of $\tau(p)$ to
construct continuous bounds in a consistent way, although this is not what is done in practice.
Bessonova et al. [1976] used a statistical technique for estimating confidence intervals for τ at fixed
values of p, but their revision of NL still needed continuous bounds on $\tau(p)$, which they
constructed by interpolating between the computed points.
Interpolation of this sort can violate the maxim that an estimate of a function should not deteriorate
if more information becomes available: the interpolated bounds for τ would be pushed out if an
additional wide confidence interval for τ were computed at a p between two narrow confidence intervals.
Interpolation would be acceptable if it did not influence the results unduly, but we show that it can
produce radical changes in the velocity-depth envelope. This was not apparent in earlier comparisons of
LP and NL [Garmany et al., 1979]. It is very difficult to predict the effect of different data
interpolation schemes on the models NL finds because the data and models are nonlinearly related.
Interpolation may rule out models that satisfy the finite set of data and yet may allow unphysical
models that violate the assumptions of the method. For example, we must insist that candidate velocity
models be single-valued. Since NL builds its models from the continuous data bounds, the bounds
themselves must be legitimate $\tau(p)$ profiles, which in general they are not. When they are not, NL
modifies the continuous data bounds so that they do correspond to realizable velocity models. The
rather ad hoc procedure may result in new "data" that violate the original bounds.

These difficulties of NL are not present in the LP formulation of the problem. LP directly incorporates
the constraint that candidate models be physically realizable: it avoids the problems of multivalued
velocity functions in a straightforward and consistent fashion. LP escapes the need to interpolate by
working in the other direction: rather than construct velocity models by transforming the data bounds
(which must then be defined over a continuous range of p), LP examines all models that satisfy the
finite collection of data and chooses those with the greatest and least velocities at some depth. It is
possible to use LP to discover the envelope containing all physical $\tau(p)$ profiles that satisfy the
finite data set since τ itself could be used as the penalty functional, but this is not the basis of
the inversion.

We would like to be able to incorporate other a priori information about the range of possible earth
models into our inversions to tighten the velocity-depth bounds. The assumption that $dv/dz \ge 0$ is
necessary with LP and NL but translates to an ad hoc proscription in the spherical earth:
$d\mathrm{v}/dr \le \mathrm{v}/r$, where $\mathrm{v}(r)$ is the spherical earth velocity at radius r.
This allows low-velocity zones in the spherical earth provided velocity increases more slowly than
radius. A preferable and more powerful constraint is that $d\mathrm{v}/dr \le 0$: no low-velocity zones
in the spherical earth. We support this assumption with thermodynamic arguments applicable to the outer
core and less stringently to the inner core and lower mantle (Appendix A). The new constraint on the
models is linear and so may be easily incorporated into LP inversions. This restriction would have to
be posed in terms of the interpolated $\tau(p)$ bounds to be used with NL. Finding the correct
interpolation is practically impossible because it depends nonlinearly on all the data simultaneously.
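
The spherical earth form of the flat earth monotonicity assumption follows from the earth-flattening mapping by the chain rule. A short derivation is sketched here, writing $\mathrm{v}(r)$ for the spherical earth velocity, $v(z)$ for the flat earth velocity, and $a$ for the radius of the earth (this notation is an assumption made for the sketch):

```latex
v = \frac{a}{r}\,\mathrm{v}, \qquad z = -a\ln\frac{r}{a}
\quad\Longrightarrow\quad
\frac{dv}{dz} = \frac{dv/dr}{dz/dr}
= \frac{a\left(\dfrac{1}{r}\dfrac{d\mathrm{v}}{dr} - \dfrac{\mathrm{v}}{r^{2}}\right)}{-a/r}
= \frac{\mathrm{v}}{r} - \frac{d\mathrm{v}}{dr},
\qquad\text{so}\qquad
\frac{dv}{dz} \ge 0 \iff \frac{d\mathrm{v}}{dr} \le \frac{\mathrm{v}}{r}.
```

The stronger condition $d\mathrm{v}/dr \le 0$ therefore implies the flat earth condition, but not conversely.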


There are no inherent limitations to the accuracy of LP: the limits are set by the precision of the
machine computations and the number of basis functions one uses to represent the earth. This has been
proven rigorously by P. B. Stark (unpublished manuscript, 1986), who also proves that within reason,
the LP results are insensitive to the particular choice of basis functions. In Appendix B we exploit
the linearity of the data relations in the LP formulation and the convexity of the spherical earth to
flat earth mapping to prove that the extremal bounds in the spherical earth are just the extremal
bounds in the flat earth mapped into the spherical earth. Gerver and Markushevitch's [1966] flat earth
mapping provides a one-to-one correspondence between spherical and flat earth velocity models
predicting exactly the same $\tau(p)$ data for all values of p. However, when there is one velocity
model that satisfies a finite and inexact data set, there are usually many. It is entirely possible
that the velocity model (there may be more than one) that maximizes the flat earth velocity at some
depth while satisfying the data might not be the model that maximizes the spherical earth velocity at
the corresponding spherical earth depth since depths and velocities in the two domains are nonlinearly
related. Thus the coincidence of the extremal models is less than obvious although it has been tacitly
assumed heretofore.

Values of $\tau(p)$ are difficult to obtain in some ranges of p, while $\Delta(p)$ measurements in that
p interval may be more readily available. It is therefore very useful to be able to treat $\Delta(p)$
data jointly with $\tau(p)$ estimates. The LP formulation may employ $\Delta(p)$ and $\tau(p)$ data
concurrently [Orcutt, 1980].

To test the LP approach in the spherical earth, we have inverted the definitive $\tau(p)$ data set for
the core [Johnson and Lee, 1985] reduced from 90,000 contemporary International Seismological Centre
(ISC) travel times: it is unlikely that better spherically averaged values of $\tau(p)$ for the deep
interior of the earth will become available for some time. Like Johnson and Lee, we treat the scatter
in the estimates derived from the original $T(\Delta)$ observations as statistical noise disturbing an
ideal spherically averaged $\tau(p)$ curve and take the 99.9% confidence intervals as strict bounds on
the uncertainties of the τ values. LP produces generally wider bounds than NL inversion. This might
indicate that LP is too conservative, except that the bounds found by LP are achievable: for every
velocity-depth point on the bounds there is a model that contains that point and satisfies the finite
$\tau(p)$ data exactly. The NL bounds are thus sensitive to the interpolation of the $\tau(p)$ limits,
as mentioned earlier.

Johnson and Lee [1985] constructed five $\tau(p)$ data using $\Delta(p)$ to constrain the derivative to
incorporate additional information in a range of p where $\tau(p)$ data were unavailable. We compare LP
inversions of their data with and without these values and also with some $\Delta(p)$ data used
directly. We conclude that the five data have a major influence at the inner core boundary and in
determining the shape of the envelope in that vicinity. When the $\Delta(p)$ data are used directly, a
wider and probably more reliable corridor results.

The finite set of $\tau(p)$ and $X(p)$ data is not very restrictive: without additional assumptions the
LP bounds are fairly wide, particularly within the inner core where the flat earth mapping is strong.
Forcing the first radial derivative of velocity to be nonpositive substantially narrows the LP bounds.
Assuming that the second radial derivative of the P wave velocity in the core is nonpositive enables us
to tighten the bounds a bit more using a geometrical construction. Both radial derivative constraints
are justified by thermodynamic arguments in Appendix A. The final result, based on nonpositive first
and second radial derivatives, is an extremely narrow envelope of velocities in the core consistent
with the data. This envelope, roughly comparable to that of Johnson and Lee [1985] but tighter
particularly in the inner core, is reached by physical arguments.

Methods 

We denote velocities in the radially symmetric spherical earth model by $\mathrm{v} = \mathrm{v}(r)$
and velocities in the flat earth by $v = v(z)$. Depth z is measured from the surface of the flat earth
and radius r is measured from the center of the spherical earth. The velocity at the surface of the
sphere, which is the same as at the surface of the half-space, is $w = \mathrm{v}(a) = v(0)$. The
variables $\mathrm{v}$, r, v, and z are related by [Gerver and Markushevitch, 1966]

$$v = \frac{a}{r}\,\mathrm{v} \qquad z = -a\ln\left(\frac{r}{a}\right) \qquad (1)$$

Values of the spherical earth ray parameter $dT/d\Delta$ may be converted to their equivalent flat
earth values, $dT/dX$, by multiplying by the number of degrees per unit distance at the surface of the
sphere.
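
As an illustration of the mapping (1) and the ray parameter conversion, the following sketch maps a point on a spherical earth velocity profile into the flat earth and back. It is not taken from the paper: the function names are hypothetical, and the earth radius a = 6371 km is an assumed value used only for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0   # assumed mean earth radius, for illustration only

def sphere_to_flat(radius, velocity, a=EARTH_RADIUS_KM):
    """Map (r, v_sphere) to (z, v_flat): z = -a ln(r/a), v_flat = (a/r) v_sphere."""
    depth = -a * math.log(radius / a)
    return depth, velocity * a / radius

def flat_to_sphere(depth, velocity, a=EARTH_RADIUS_KM):
    """Inverse mapping: r = a exp(-z/a), v_sphere = (r/a) v_flat."""
    radius = a * math.exp(-depth / a)
    return radius, velocity * radius / a

def ray_parameter_to_flat(p_sec_per_degree, a=EARTH_RADIUS_KM):
    """Convert dT/dDelta (s/degree) to dT/dX (s/km) at the surface."""
    degrees_per_km = 180.0 / (math.pi * a)
    return p_sec_per_degree * degrees_per_km

# Example: a point at radius 3480 km with velocity 8 km/s
z, v_flat = sphere_to_flat(3480.0, 8.0)
r_back, v_back = flat_to_sphere(z, v_flat)
print(z, v_flat, r_back, v_back)
```

The flat earth velocity grows without bound as r approaches zero, which is one way to see why the mapping is described as strong within the inner core.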

The forward problems of mapping a one-dimensional monotonic flat earth velocity profile into $\tau(p)$
and $X(p)$ are solved by the familiar transformations [Aki and Richards, 1980, vol. II, chapter 12]:

$$\tau(p) = 2\int_0^{z(p)} \left(v(z)^{-2} - p^2\right)^{1/2} dz$$

$$X(p) = 2\int_0^{z(p)} p\left(v(z)^{-2} - p^2\right)^{-1/2} dz$$

where $z(p)$ is the turning depth of the ray with ray parameter p, i.e., the depth to velocity $1/p$.
Changing the independent variable of integration to v, we find

$$\tau(p) = 2\int_w^{1/p} \left(v^{-2} - p^2\right)^{1/2} \frac{dz}{dv}\,dv$$

$$X(p) = 2\int_w^{1/p} p\left(v^{-2} - p^2\right)^{-1/2} \frac{dz}{dv}\,dv$$

For a particular choice of $p = p_i$ these integral relations are linear functionals of $dz/dv$:

$$\tau_i[\xi] = \tau(p_i) = 2\int_w^{1/p_i} \left(v^{-2} - p_i^2\right)^{1/2}\xi(v)\,dv \qquad (2)$$

$$X_i[\xi] = X(p_i) = 2\int_w^{1/p_i} p_i\left(v^{-2} - p_i^2\right)^{-1/2}\xi(v)\,dv \qquad (3)$$

where $\xi(v) = dz/dv$ is the function we will use to represent the earth model. We can find $z(v)$ by
integration if we know $\xi(v)$ and w:

$$Z_v[\xi] = z(v) = \int_w^{v} \xi(v')\,dv' \qquad (4)$$

$Z_v[\xi]$ is a linear functional of $\xi(v)$, and knowing $z(v)$ is equivalent to knowing $v(z)$
provided there are no low-velocity zones, that is, provided $\xi(v) > 0$, so that both $z(v)$ and
$v(z)$ are single-valued.

We take certain confidence limits on $\tau(p)$ and $X(p)$ to be strict bounds on l observations of τ
and m observations of X, i.e., we assume that we are given two n vectors $d^-$ and $d^+$ such that

$$d_i^- \le \tau_i[\xi] \le d_i^+ \qquad i = 1, \ldots, l \qquad (5)$$

$$d_i^- \le X_i[\xi] \le d_i^+ \qquad i = l+1, \ldots, n \qquad (6)$$

where $n = l + m$ is the total number of data. We assume that the observations are ordered such that
$p_i < p_{i+1}$, $i = 1, \ldots, l-1$, and $i = l+1, \ldots, n-1$. The maximum velocity about which we
have information is then $y = \max\{1/p_1, 1/p_{l+1}\}$. The data relations expressed in (5) and (6)
are a set of n two-sided linear inequalities in ξ. Following Garmany et al. [1979], we solve the
problem of finding strict limits on the range of velocities by determining the range of depths in which
each velocity is allowed: we alternately maximize and minimize $Z_v[\xi]$ for each target velocity v,
subject to the collection of 2n linear inequality constraints (5), (6), and the positivity constraint
$\xi(v) > 0$. Each of these optimization problems is an infinite-dimensional linear program in the
space in which we decide to embed ξ. Appendix B shows that the solutions to these problems in the flat
earth are just the extremal bounds in the spherical earth mapped by equations (1), so we may solve the
spherical earth extremal bound problem by mapping the data into the flat earth, solving the flat earth
problem and mapping the results back into the spherical earth.

Solving  the  Flat  Earth  Extremal  Bound  Problem 

In practice, we must describe the unknown earth model $\xi(v)$ with a finite collection of numbers: a
computer cannot store a value of $\xi(v)$ for every value of v. If we write ξ as a linear combination
of a particular finite set of basis functions, the coefficients in the linear combination constitute a
finite description of the model a computer can use. The basis set is acceptable if we are able to
approximate the data and depth mappings of any ξ arbitrarily well by using more and more basis
functions of the class that we choose (equivalently, if the span of the basis functions is weak-star
dense in the limit). The integrals for the mappings may be performed for the basis functions
individually, and the resulting numbers, scaled by the coefficients in the expansion, may be added to
give the value of the integrals performed on ξ since the integrals are linear in ξ. We can avoid
numerical quadrature and retain the highest accuracy in our computations if the integrals can be
performed analytically for the chosen basis functions.

Delta functions are a natural basis set for the flat earth problem because they give rise to
homogeneous layers in v. Intuitively, we see that such layers allow changes in velocity to be made as
early or as late as possible. We can support this more rigorously: the fundamental theorem of linear
programming [Luenberger, 1973] states that if there is an optimal linear combination of the basis
functions that solves the problem, there is an optimal solution comprised of a linear combination of at
most as many basis vectors as there are constraints. If we were to choose boxcar functions to
approximate the solution, then in the limit as the number of basis functions goes to infinity and the
boxcars become vanishingly narrow, we would still require at most 2n basis functions to represent the
solution: the solution would tend to a sum of delta functions. (Notice that in using a delta function
basis and finite-dimensional linear programming, we must widen the constraints on ξ from $\xi > 0$ to
$\xi \ge 0$; this extension apparently causes difficulties in the velocity domain because we lose the
constraint that $v(z)$ is single-valued. However, if the solution is interpreted in the limit as the
lower bound on ξ approaches zero, merely discontinuous velocities are generated.) The delta function
basis has some practical benefits as well: the integrals (2), (3), and (4) are trivial, and we may
guarantee positivity of $\xi(v)$ just by requiring positivity of the coefficients in the basis
expansion.

While delta functions are truly optimal only when there is complete freedom to place the layers where
needed (i.e., in the limit of an infinite number of basis functions), we can guarantee flexibility near
the target velocity v where it is most crucial by inserting basis functions on either side of v. This
tailoring of the basis expansion to the target velocity has produced substantial improvements in the
computational efficiency over earlier realizations of LP [e.g., Garmany et al., 1979]. P. B. Stark
(unpublished manuscript, 1986) provides a theoretical account of the improvement.

The  choice  of  a  delta  function  basis  for  dz/dv  suggests 
that  we  are  approximating  a  problem  where 
z  (v)  €  BV[w,y],  the  Banach  space  of  functions  of 
bounded  variation  on  the  interval  [w.y]  with  the  variation 
as  the  norm.  Our  delta  function  expansion  of  dzfdv  leads 
to  a  step  function  expansion  of  z(v)€  BV  upon  integra¬ 
tion.  Spacing  the  basis  functions  evenly  in  velocity  is  not 
desirable:  most  of  the  sensitivity  to  the  data  occurs  close 
to  the  surface  because  the  integral  (2)  depends  on  l/v.  It 
is  preferable  to  space  the  steps  evenly  in  slowness,  i.e., 
reciprocally  in  velocity.  We  have  found  that  the  numerical 
solution  is  much  more  stable  with  this  spacing,  partly 
because  it  improves  the  conditioning  of  the  mapping 
matrix.  It  can  be  proven  that  the  span  of  a  reciprocally 
spaced  set  of  step  functions  is  weak-star  dense  in 
BV\w,y)  in  the  limit,  so  a  basis  of  this  form  is  accept¬ 
able.  Our  preliminary  basis  set  shall  be 

b,  ( v )  =  8(v— v,)  (7) 

where 

!/>■  +  (L  —  j  -  2)h  7=1 . L~2 

and 

h  =  .1/ »  ~  V> 


Recall that w is the minimum velocity and y is the maximum
one. When we choose a certain target velocity v,
we insert extra basis functions b_{L-1} and b_L at
1/(1/v + αh) and 1/(1/v − αh), where α is a small positive
constant, about 0.1 typically.
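
The node placement just described can be sketched in a few lines of Python. This is an illustration only, not the authors' code: the function name `basis_velocities` is hypothetical, and the spacing h = (1/w − 1/y)/(L − 3) is an assumption chosen so that the first and last preliminary nodes fall at w and y.

```python
def basis_velocities(w, y, L, v_target, alpha=0.1):
    """Velocities of L delta-function basis elements (illustrative sketch).

    The first L-2 nodes are evenly spaced in slowness between 1/w and 1/y;
    the last two bracket the target velocity at slowness offsets +/- alpha*h.
    """
    if L < 4:
        raise ValueError("need two preliminary nodes plus the bracketing pair")
    # Assumed spacing: makes v_1 = w and v_{L-2} = y.
    h = (1.0 / w - 1.0 / y) / (L - 3)
    # 1/v_j = 1/y + (L - j - 2) h for j = 1, ..., L-2
    nodes = [1.0 / (1.0 / y + (L - j - 2) * h) for j in range(1, L - 1)]
    if 1.0 / v_target - alpha * h <= 0.0:
        raise ValueError("bracketing slowness must stay positive")
    nodes.append(1.0 / (1.0 / v_target + alpha * h))  # b_{L-1}: just below target
    nodes.append(1.0 / (1.0 / v_target - alpha * h))  # b_L: just above target
    return nodes, h


# Example values are arbitrary and chosen only to exercise the function.
nodes, h = basis_velocities(w=8.0, y=250.0, L=12, v_target=200.0)
```

Because the spacing is uniform in slowness, the nodes crowd together at low velocity, where the data are most sensitive.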

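We now write ζ, the unknown earth model, in terms of
its basis expansion and perform the integrals (2), (3), and
(4):

$$ \zeta(v) = \sum_{j=1}^{L} \zeta_j\,\delta(v - v_j) $$

$$ \tau_i[\zeta] = 2 \sum_{j \in J_{1/p_i}} \zeta_j\,(v_j^{-2} - p_i^2)^{1/2} \qquad (8) $$

$$ X_i[\zeta] = 2 \sum_{j \in J_{1/p_i}} \zeta_j\,p_i\,(v_j^{-2} - p_i^2)^{-1/2} \qquad (9) $$

$$ Z_v[\zeta] = \sum_{j \in J_v} \zeta_j \qquad (10) $$
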

where the index set J_x = {j | v_j < x}. Note that (9) is
unbounded if there is a basis function at v = 1/p_i,
i ∈ {l+1, ..., n}, and so that particular choice must be
avoided. This minor complication has been resolved in a
consistent and acceptable fashion by defining the integrals
on open intervals, so it is not a fundamental limitation.
Equations (8), (9), and (10) may be written as vector dot
products:

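$$ \tau_i[\zeta] = \mathbf{T}_i \cdot \boldsymbol{\zeta} \qquad (11) $$

$$ X_i[\zeta] = \mathbf{X}_i \cdot \boldsymbol{\zeta} \qquad (12) $$

$$ Z_v[\zeta] = \mathbf{Z}_v \cdot \boldsymbol{\zeta} \qquad (13) $$

In (11)-(13), ζ is the vector of coefficients ζ_j in the basis
expansion (7) of ζ,

$$ T_{ij} = \begin{cases} 2\,(v_j^{-2} - p_i^2)^{1/2}, & v_j < 1/p_i \\ 0, & v_j \ge 1/p_i \end{cases} $$

$$ X_{ij} = \begin{cases} 2\,p_i\,(v_j^{-2} - p_i^2)^{-1/2}, & v_j < 1/p_i \\ 0, & v_j \ge 1/p_i \end{cases} $$

$$ Z_{vj} = \begin{cases} 1, & v_j < v \\ 0, & v_j \ge v \end{cases} $$
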

For a given v we wish to minimize

$$ \pm\,\mathbf{Z}_v \cdot \boldsymbol{\zeta} \qquad (14) $$

subject to

$$ \zeta_j \ge 0, \qquad j = 1, \ldots, L $$

$$ d_i^- \le \mathbf{T}_i \cdot \boldsymbol{\zeta} \le d_i^+, \qquad i = 1, \ldots, l $$

$$ d_i^- \le \mathbf{X}_i \cdot \boldsymbol{\zeta} \le d_i^+, \qquad i = l+1, \ldots, n $$

Minimizing +Z_v·ζ minimizes the depth to v, and minimizing
−Z_v·ζ maximizes the depth. These are standard
finite-dimensional linear programs, and software to solve
them is widely available. Two moderately large linear programs
must be solved for each target velocity, so the computational
effort is far from trivial.
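
As a hedged illustration of how the ingredients of this linear program might be assembled for the delta-function basis, the sketch below builds the rows T_i, X_i, and Z_v from a list of node velocities and checks whether a trial coefficient vector is feasible. The function names (`delta_basis_rows`, `dot`, `is_feasible`) and the numerical values are hypothetical; the LP solve itself is left to whatever solver is available.

```python
import math


def delta_basis_rows(nodes, p_tau, p_x, v_target):
    """Rows of the LP for a delta-function basis located at `nodes`."""
    # T_ij = 2 (v_j^-2 - p_i^2)^(1/2) for v_j < 1/p_i, else 0
    T = [[2.0 * math.sqrt(vj ** -2 - p * p) if vj < 1.0 / p else 0.0
          for vj in nodes] for p in p_tau]
    # X_ij = 2 p_i (v_j^-2 - p_i^2)^(-1/2) for v_j < 1/p_i, else 0
    X = [[2.0 * p / math.sqrt(vj ** -2 - p * p) if vj < 1.0 / p else 0.0
          for vj in nodes] for p in p_x]
    # Z_vj = 1 for v_j < target velocity, else 0
    Z = [1.0 if vj < v_target else 0.0 for vj in nodes]
    return T, X, Z


def dot(row, zeta):
    return sum(r * c for r, c in zip(row, zeta))


def is_feasible(zeta, rows, lower, upper):
    """Check zeta_j >= 0 and lower_i <= row_i . zeta <= upper_i."""
    if any(c < 0.0 for c in zeta):
        return False
    return all(lo <= dot(r, zeta) <= hi
               for r, lo, hi in zip(rows, lower, upper))


# Arbitrary illustrative numbers, not data from the paper.
nodes = [8.0, 9.0, 10.0, 11.0]
T, X, Z = delta_basis_rows(nodes, p_tau=[0.1], p_x=[0.1], v_target=9.5)
zeta = [1.0, 2.0, 3.0, 4.0]
depth = dot(Z, zeta)  # objective value: depth to the target velocity
```

Passing +Z and −Z as the objective to an LP solver, with these rows as two-sided inequality constraints, would give the minimum and maximum depths for one target velocity.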

Constraining the Radial Derivative of Velocity


An essential assumption of the foregoing derivation is
that ζ(v) ≥ 0, i.e., that there are no low-velocity zones in
the flat earth. This becomes the ad hoc requirement that
dv/dr ≤ v/r in the spherical earth: there may be low-velocity
zones. The low-velocity zones are mild in the
crust and upper mantle because velocity increases no faster
than radius, but in the core they can be major features
of the solution. It is geophysically preferable to assume
that there are no low-velocity zones in the spherical earth:
dv/dr ≤ 0. In Appendix A we argue that this should be
true within homogeneous, adiabatic regions in the earth
composed of "normal" materials. These conditions are
generally thought to apply approximately throughout the
outer core, and are probably not violated within the inner
core. The new assumption does not contradict the previous
constraint on the flat earth derivative; it is more
restrictive:

$$ 0 \le \zeta(v) \le \frac{a}{v} \qquad (15) $$

a result easily obtained from (1). Inequality (15) is linear
in ζ, so it can be incorporated into the inversion with the
same mathematical machinery. However, we may no
longer use a delta function basis for ζ, because delta functions
violate the new constraint everywhere they fail to
vanish. If (15), which depends on 1/v, is to apply exactly
over the support of a basis function defined on an interval,
the basis function must also depend on 1/v. Logarithmic
functions in z(v) correspond to 1/v basis functions
for ζ = dz/dv. The span of reciprocally spaced pieces
of logarithmic functions is also weakly dense in BV[w,y]
in the limit as the number of basis functions increases
without bound, so this basis is acceptable. P. B. Stark
(unpublished manuscript, 1986) proves that the solution is
not sensitive to the choice of basis functions provided they
get closer and closer together as more of them are used
and provided enough of them are used. The logarithmic
functions are particularly good because fewer of them are
required than of other basis functions (e.g., ramps) to get
results of the same accuracy. We proceed by expanding ζ
in a new set of basis functions c_j and performing the
resulting integrals (2), (3), and (4). Let

$$ \zeta(v) = \sum_j \zeta_j\,c_j(v) $$

where

$$ c_j(v) = \frac{1}{v}\,\Pi\!\left(\frac{v - v_j}{v_{j+1} - v_j}\right), \qquad \Pi(x) = \begin{cases} 1, & 0 < x \le 1 \\ 0, & \text{otherwise} \end{cases} $$

The velocities v_j are defined as in equation (7) except that
they have been put in increasing order (the two velocities
bracketing the target velocity are no longer at the end of
the list). Now

$$ \tau_i[\zeta] = \sum_{j \in J_{1/p_i}} \zeta_j\,[F_i(\tilde\theta_{ij}) - F_i(\theta_{ij})] $$

$$ X_i[\zeta] = 2 \sum_{j \in J_{1/p_i}} \zeta_j\,(\theta_{ij} - \tilde\theta_{ij}) $$

$$ Z_v[\zeta] = \sum_{j \in J_v} \zeta_j \ln\frac{\min\{v, v_{j+1}\}}{v_j} $$

where

$$ F_i(\theta) = 2 p_i\,(\theta - \tan\theta) $$

$$ \theta_{ij} = \cos^{-1}(p_i v_j) $$

$$ \tilde\theta_{ij} = \cos^{-1}(p_i \min\{v_{j+1}, 1/p_i\}) $$

These three expressions may also be written as vector dot
products (11), (12), and (13), with the new identifications

$$ T_{ij} = \begin{cases} F_i(\tilde\theta_{ij}) - F_i(\theta_{ij}), & v_j < 1/p_i \\ 0, & v_j \ge 1/p_i \end{cases} $$

$$ X_{ij} = \begin{cases} 2\,(\theta_{ij} - \tilde\theta_{ij}), & v_j < 1/p_i \\ 0, & v_j \ge 1/p_i \end{cases} $$

$$ Z_{vj} = \begin{cases} \ln(\min\{v, v_{j+1}\}/v_j), & v_j < v \\ 0, & v_j \ge v \end{cases} $$

The new finite-dimensional problem with radial derivative
constraints is to minimize

$$ \pm\,\mathbf{Z}_v \cdot \boldsymbol{\zeta} $$

subject to the constraints

$$ a \ge \zeta_j \ge 0, \qquad j = 1, \ldots, L $$

$$ d_i^- \le \mathbf{T}_i \cdot \boldsymbol{\zeta} \le d_i^+, \qquad i = 1, \ldots, l $$

$$ d_i^- \le \mathbf{X}_i \cdot \boldsymbol{\zeta} \le d_i^+, \qquad i = l+1, \ldots, n $$

Minimizing +Z_v·ζ minimizes the depth to v, and minimizing
−Z_v·ζ maximizes the depth. These are also standard
linear programs and may be solved straightforwardly.
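
The closed-form entries for the 1/v basis can be checked numerically. The sketch below is an illustration under stated assumptions, not the authors' implementation: it takes τ = 2∫ζ (v⁻² − p²)^{1/2} dv and X = 2p∫ζ (v⁻² − p²)^{-1/2} dv as the forms of integrals (2) and (3), in line with the delta-basis results (8) and (9), and compares the closed forms with a midpoint-rule quadrature over one basis interval. All function names and numbers are hypothetical.

```python
import math


def F(p, theta):
    """F(theta) = 2 p (theta - tan(theta))."""
    return 2.0 * p * (theta - math.tan(theta))


def log_basis_entries(p, v_lo, v_hi, v_target):
    """T, X, Z entries for the basis function 1/v supported on (v_lo, v_hi]."""
    if v_lo < 1.0 / p:
        theta = math.acos(p * v_lo)
        # Upper limit is clipped at the turning velocity 1/p.
        theta_top = math.acos(min(1.0, p * min(v_hi, 1.0 / p)))
        T = F(p, theta_top) - F(p, theta)
        X = 2.0 * (theta - theta_top)
    else:
        T = X = 0.0
    Z = math.log(min(v_target, v_hi) / v_lo) if v_lo < v_target else 0.0
    return T, X, Z


def midpoint(f, a, b, n=20000):
    """Composite midpoint rule, used only as an independent check."""
    step = (b - a) / n
    return step * sum(f(a + (k + 0.5) * step) for k in range(n))


p, v_lo, v_hi = 0.05, 8.0, 10.0  # arbitrary values with v_hi < 1/p
T, X, Z = log_basis_entries(p, v_lo, v_hi, v_target=9.0)
T_num = midpoint(lambda v: 2.0 / v * math.sqrt(v ** -2 - p * p), v_lo, v_hi)
X_num = midpoint(lambda v: 2.0 * p / v / math.sqrt(v ** -2 - p * p), v_lo, v_hi)
```

The substitution pv = cos θ turns both integrands into elementary functions of θ, which is where F_i and the angle differences come from.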

Application to the Earth's Core

We have applied the theory to τ(p) data for the core
obtained by Johnson and Lee [1985] (Figure 1). We follow
Johnson and Lee in interpreting the scatter of the data as
a noise process distorting an ideal, spherically averaged
data curve. The data bounds determined by Johnson and
Lee are the 99.9% confidence limits on τ(p), which both
we and they take to be firm bounds within which τ must
lie. P. B. Stark and R. L. Parker (unpublished manuscript,
1986) have developed a method of inverting the
confidence interval data without reinterpreting the intervals
as strict bounds. The results of inversion with the
statistical treatment of the bounds, though different in
detail, are surprisingly similar to the results assuming strict
data bounds. In reality, the noise components of the
observations are probably not statistically independent
because they are principally the result of large-scale
heterogeneities in the earth and anomalies associated with
the sources and receivers; the actual statistics of the τ
values are largely unknown and almost certainly
non-Gaussian.

The inversions that follow used 100 basis functions in
the preliminary expansion and an additional pair [...]
the target velocity. Using 200 did not noticeably [...]
the results; the bounds have converged [...]
inversions started at a radius of 3480 km [...]
which Johnson and Lee corrected [...]
PREM anisotropic earth model [... 1981]. We used a [...]

Fig. 1. The τ(p) data of Johnson and Lee [1985] replotted. The
ray parameters are the equivalent flat earth values, having been
scaled by 180/π and divided by a core radius of 3480 km.

models from an inversion with target velocity
v = 200 km s⁻¹. The effect of the flat earth mapping is
seen in the large values of velocity (250 km s⁻¹) and
depth (10,000 km) as radius tends to zero.
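
To see why flat earth values grow so large near the center, the sketch below applies an exponential mapping to the numbers quoted above. Mapping (1) is not reproduced in this section, so the code assumes the standard earth-flattening form z = a ln(a/r), v_flat = a v_sph / r, which is consistent with inequality (15) and with the exponential mapping mentioned in the figure captions. The function names are hypothetical.

```python
import math


def spherical_to_flat(r, v_sph, a):
    """Assumed mapping: depth z = a ln(a/r), flat velocity = a v / r."""
    z = a * math.log(a / r)
    return z, v_sph * a / r


def flat_to_spherical(z, v_flat, a):
    """Inverse of the assumed mapping: r = a exp(-z/a), v = v_flat r / a."""
    r = a * math.exp(-z / a)
    return r, v_flat * r / a


a = 3480.0  # core radius in km, as used in the text
r, v = flat_to_spherical(10000.0, 250.0, a)
z_back, v_back = spherical_to_flat(r, v, a)
```

Under this assumed mapping, a flat earth depth of 10,000 km and velocity of 250 km s⁻¹ correspond to a radius of roughly 200 km and a spherical velocity near 14 km s⁻¹.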

We first found bounds on the velocity in the core
assuming that only the flat earth velocity gradient is
nonnegative (i.e., dv/dr ≤ v/r), the identical situation treated
by Johnson and Lee. The solid line in Figure 3 is the LP
solution using all of Johnson and Lee's [1985] τ(p) values;
the dashed line is the LP solution excluding the five
values constructed with Δ(p). The envelopes are determined
by about 180 target velocities. The shaded region
is Johnson and Lee's NL solution based on all their data.
Though both approaches begin with the same τ(p_i) data,
many more solutions are accessible to the LP technique

Fig. 2. Two extremal solutions in the equivalent flat earth, one
(solid line) minimizing the depth to a velocity of 200 km s⁻¹, the
other (dashed) maximizing it. Notice the extremely large values
of velocity and depth from the exponential mapping.


Fig. 3. Velocity bounds calculated without requiring v to decrease
with radius. The shaded zone is the solution of Johnson and Lee
[1985] obtained by NL using all 30 data of Figure 1. The solid
line is our LP solution based upon the same data; notice how
much wider the bounds found by LP are. The dashed bounds are
obtained when the five interpolated values indicated in Figure 1
are omitted; the expansion of the bounds shows how crucial these
five data are to the solution near the inner core boundary.

than to NL. For each point on the LP bounds there is a
velocity model that satisfies the measured τ(p) data and
contains that velocity-depth point. We conclude that the
NL bounds are narrower than the data require because
they exclude models that satisfy the original finite list of
τ(p) data. The jagged excursions in the LP bounds are
due to the finite size of the data set: between data, where
the constraints on the model are not so demanding, LP is
free to make the depth smaller or greater as asked. One
should note that even if the inner excursions of the LP
bounds were connected, simulating the effect of interpolating
the τ(p) data, the resulting envelope would still lie
outside the NL bounds. Neither NL nor LP claims that
the bounds themselves are reasonable velocity models, but
rather that, in the absence of additional information, each
velocity-depth point on the bounds is contained in some
velocity model that does satisfy the data. The NL solution
does not exhibit the same ragged behavior as LP because
the τ(p) data have been first interpolated to get continuous
bounds. It follows from the wide difference between
the results of the two approaches that the NL answer is
sensitive to the precise interpolation procedure employed.
Unfortunately, it is clear that although technically correct,
the LP bounds are too wide to be very interesting. This is
not a fault of LP; on the contrary, it shows that the
present list of τ(p) data is insufficient to bound velocity
very well: if we want tighter bounds, we must either make
additional assumptions or introduce more data with the
same high accuracy.

Johnson and Lee [1985] used Δ(p) values from the
Tonto Forest Seismological Observatory to construct five
values of τ(p) to refine the envelope near the inner core
boundary. The sensitivity of the bounds in that vicinity to
these five τ(p) data is clear from comparison of the solid
and dashed bounds. There is no need for us to interpolate
to incorporate the additional information since LP can
exploit Δ(p) data directly. The Δ(p) data are shown in

Fig. 4. Δ(p) data of Johnson and Lee [1985] shown as circles.
The solid curve shows the predictions of the 1066A earth model.
We have assigned uncertainties and adopted values at five ray
parameters for our inversion; the numerical values appear in
Table 1.

Figure 4 as circles, along with the Δ(p) predictions of
model 1066A [Gilbert and Dziewonski, 1975] and our crude
assessment of errors in the range of p where τ was not
available. The error bounds are based upon the variance
of seven Δ(p) points at about the same p. These conservative
values were preferred because so few measurements
were available and because the Δ(p) data, from an
array in central Arizona, are not expected to be completely
representative of the spherically averaged earth. We
adjusted the Δ(p) values to 3480 km by subtracting the
effects of the crust and mantle of the PREM anisotropic
earth model to be consistent with Johnson and Lee; the
reduced data are given in Table 1.

We can tighten the bounds by making the assumption
that dv/dr ≤ 0, as described earlier and justified
thermodynamically in Appendix A. The dashed bounds in Figure
5 are determined from the 25 τ(p) data, while the solid
bounds include the five Δ(p) constraints from Figure 4;
the only noticeable difference is in the vicinity of the inner
core boundary. The shaded zone is the region obtained
from all 30 τ(p) data of Johnson and Lee [1985]. The


TABLE 1. Five Values of Δ for Rays Passing Near the
Inner Core Boundary

p, s deg⁻¹     p, s km⁻¹     Δ, deg
1.775          0.02922       119.752
1.830          0.03013       115.003
1.880          0.03095        98.026
1.983          0.03265       109.682
2.033          0.03347       112.670

Each Δ value is assigned an uncertainty of ±16.12
degrees. The values have been reduced to the surface
of the core (radius 3480 km) by subtracting the
predictions of the PREM mantle from the observations
shown in Figure 4.
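
The two ray parameter columns of Table 1 are related by the scaling described in the caption of Figure 1: multiply by 180/π and divide by the core radius of 3480 km. The short check below reproduces the second column from the first; the function and variable names are illustrative only.

```python
import math

CORE_RADIUS_KM = 3480.0  # core radius quoted in the text


def ray_parameter_s_per_km(p_s_per_deg, radius_km=CORE_RADIUS_KM):
    """Convert a ray parameter from s/deg to the flat earth value in s/km."""
    return p_s_per_deg * (180.0 / math.pi) / radius_km


# (p in s/deg, p in s/km) pairs as listed in Table 1
table_1 = [
    (1.775, 0.02922),
    (1.830, 0.03013),
    (1.880, 0.03095),
    (1.983, 0.03265),
    (2.033, 0.03347),
]

converted = [ray_parameter_s_per_km(p_deg) for p_deg, _ in table_1]
```

Each converted value agrees with the tabulated one to the five decimal places printed.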


Fig. 5. Velocity bounds in the core when v is required to decrease
with increasing radius. The shaded zone is obtained using all
thirty data of Figure 1. The region bounded by dashed lines is
found when the five interpolated data are omitted; that bounded
by the solid lines results from replacing the interpolated τ values
by the Δ values of Figure 4.


solid line represents our preferred solution since we would
rather use the Δ(p) data directly. Each pair of bounds is
determined by about 180 points. Figure 6 superposes the
bounds from 25 τ(p) data with and without the radial
derivative constraint. The power of the radial derivative
constraint is immediately evident: the corridor is extremely
narrow, even in the inner core where the exponential
mapping takes its greatest toll. An important reason
constraints of this sort are so powerful is that they constrain
the model at each point, whereas data constrain integrals
of pieces of the models.

We would like to require the second radial derivative of
velocity in the core to be nonpositive (Appendix A).
Manipulation of the flat earth mapping (1) shows that
d²v/dr² ≤ 0 is equivalent to −(dz/dv)² ≤ a d²z/dv², a
nonlinear constraint in this formulation that cannot be
imposed exactly in LP. However, we can use a geometrical
argument to rule out some of the corridor allowed by
the first radial derivative constraint. It must be possible to
join any point within bounds incorporating the second
derivative constraint to both ends of the envelope with a

Fig. 6. A comparison of the velocity bounds with and without the
constraint that v decrease with radius using the 25 τ values in
Figure 1 not based upon interpolation.

Fig. 7. Most restricted solution discussed in this paper, shown by
the large dashes. The velocity must decrease with radius and the
weak form of the constraint that d²v/dr² ≤ 0 has been applied
separately in the outer and inner core. The data comprise 25 τ
and five Δ values. The region contained by the solid lines omits
the constraint on the second derivative of v. The PREM model is
plotted lightly dashed. PREM does not satisfy the τ(p) data.

curve that is never concave up as our figures are drawn.
We can therefore exclude anything to the left of line
segments intersecting the lower bound twice. The upper
bound is a bit more complicated: we can exclude anything
separated from the rest of the envelope by a line that
intersects the upper bound twice and the lower bound
once. Figure 7 shows the results of this proscription and
is our most constrained solution based upon τ(p) and
Δ(p) observations. (We have plotted the PREM model
for reference; it lies within the bounds but, as Johnson
and Lee demonstrated, does not satisfy the τ(p) data.)
Although the correct bounds incorporating the second
derivative constraint are undoubtedly narrower, the
geometrical consideration allows us to rule out some
solutions without solving a nonlinear problem. The same type
of argument applies to the first radial derivative constraint:
it must be possible to join any point in the corridor to
both ends of the envelope with a curve that is everywhere
nonincreasing with radius. Applying this principle to the
bounds based on only the flat earth constraint gives much
wider bounds than does proper use of the radial
constraint.

Conclusions 

We have presented a theory and an algorithm for
finding the best possible envelope of velocities in a spherical
earth consistent with a finite number of τ(p) and Δ(p)
observations whose uncertainties are expressed as strict
intervals. The solution begins by mapping the sphere into
a half-space in which velocity varies only with depth, the
equivalent flat earth problem. This mapping has been
used before, but we show for the first time that maximizing
or minimizing the depth functional in the flat earth
always leads to a corresponding extremum in the spherical
system, even though a nonlinear transformation of sets
has taken place.

We use the linear programming approach of Garmany et
al. [1979] to construct a corridor in the velocity-depth
plane that contains all models consistent with the data and
the constraint that there are no low-velocity zones in the
flat earth. The original formulation has been improved in
several ways: we have added the ability to include Δ(p)
data in the manner of Orcutt [1980], and we have shown
how to select a set of basis functions in the numerical
approximation of the problem so that precise bounds are
found with a relatively small number of layers.

The requirement that there be no low-velocity zones in
the flat earth leads to the ad hoc restriction in the spherical
system that dv/dr ≤ v/r. This inequality permits
low-velocity zones of increasing intensity as one approaches
the center of the earth. We show by thermodynamic
reasoning that P wave velocity in the core should increase
with depth so that dv/dr ≤ 0. The constraint remains
linear when mapped into the flat earth and so may be
readily included in the linear programming formulation.
We also show that it is quite likely that d²v/dr² ≤ 0 in the
core. This condition cannot be mapped into a linear
condition in the alternative domain and so it has not been
fully exploited; a weak form of the condition can be
enforced without any additional computation.

We have tested the theory with τ(p) data for the core
prepared by Johnson and Lee [1985] and used by them to
constrain the P wave velocity profile. Our technique
produces numerous velocity models that satisfy the finite list
of τ(p) data but lie outside Johnson and Lee's bounds;
the LP bounds are much wider. The data alone are
insufficient to resolve velocity well, so it is desirable to add
information to the inversion by making additional
assumptions about the earth. Adding the physical constraint on
the velocity gradient tightens the LP bounds considerably,
bringing them inside the original corridor of Johnson and
Lee. Our results suggest that the P wave velocity can be
determined to an accuracy of better than ±0.25 km s⁻¹
almost everywhere in the outer core and ±0.1 km s⁻¹ in
a large part of the inner core. We can show by the same
kind of geometrical argument used to apply the second
derivative constraint that the inner core boundary must lie
between 1207 and 1242 km if it is a simple discontinuity.

Appendix  A:  Physical  Restrictions  on  the 
Velocity  in  the  Outer  Core 

It has been accepted that the bulk of the outer core is
very nearly adiabatic and homogeneous since the work of
Birch [1952]. Free oscillation data do not indicate any
significant departures from this state [Masters, 1979], and
so modern earth models tend to build in these properties.
We can constrain the first and second radial derivatives of
V_P(r), the compressional velocity as a function of radius,
if we assume that the outer core is adiabatic and
homogeneous and core material is "normal." (In this appendix
only we adopt symbols common in geophysical
thermodynamics: p is pressure, not ray parameter, V_P will be
used for P wave velocity and T for absolute temperature;
we believe this is less confusing than using unfamiliar
symbols for these variables.) By normal we mean that
(∂K_S/∂p)_S, the isentropic change in the bulk modulus
with pressure, is about 3 to 4, and decreases slowly with
isentropic compression. This assumption is supported by
the finite-strain fits to the properties of the outer core
done by Davies and Dziewonski [1975], who found that
∂K_S/∂p decreases from about 3.6 at the core-mantle
boundary to about 3.45 at the inner core boundary.
Assuming hydrostatic equilibrium,

$$ \left(\frac{dV_P}{dr}\right)_{S,C} = -\frac{g}{2V_P}\left[\left(\frac{\partial K_S}{\partial p}\right)_{S,C} - 1\right] \qquad (A1) $$

where the subscripts S,C denote constant entropy and
composition and g is the acceleration due to gravity.
Since ex hypothesi (∂K_S/∂p)_{S,C} ≥ 1, we find immediately
that the first radial derivative of V_P is negative. This
would be the first result we need if the core had an
adiabatic temperature gradient and uniform composition; we
examine the effects of departures from these conditions
shortly.
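
For readers who want the intermediate steps, the following is a short sketch of how a relation of this form follows from standard fluid thermodynamics. It assumes V_P² = K_S/ρ in a fluid, the isentropic compression relation dρ/dp = ρ/K_S, and hydrostatic equilibrium dp/dr = −ρg; the density symbol ρ is introduced here for the derivation only.

```latex
% Sketch: adiabatic velocity gradient in a homogeneous fluid
\begin{align*}
V_P^2 &= \frac{K_S}{\rho}
\quad\Longrightarrow\quad
2 V_P \left(\frac{\partial V_P}{\partial p}\right)_{S,C}
  = \frac{1}{\rho}\left(\frac{\partial K_S}{\partial p}\right)_{S,C}
  - \frac{K_S}{\rho^2}\left(\frac{\partial \rho}{\partial p}\right)_{S,C} \\
&\qquad\qquad\;
  = \frac{1}{\rho}\left[\left(\frac{\partial K_S}{\partial p}\right)_{S,C} - 1\right]
  \quad\text{since } \left(\frac{\partial \rho}{\partial p}\right)_{S,C} = \frac{\rho}{K_S}, \\
\left(\frac{dV_P}{dr}\right)_{S,C}
  &= \left(\frac{\partial V_P}{\partial p}\right)_{S,C}\frac{dp}{dr}
   = \frac{1}{2 V_P \rho}\left[\left(\frac{\partial K_S}{\partial p}\right)_{S,C} - 1\right](-\rho g)
   = -\frac{g}{2 V_P}\left[\left(\frac{\partial K_S}{\partial p}\right)_{S,C} - 1\right].
\end{align*}
```

With (∂K_S/∂p)_{S,C} between about 3 and 4, the bracket is positive and the gradient is negative, as stated.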

Differentiating (A1) gives

Gravitational acceleration g(r) is relatively insensitive to
details of the density distribution within the earth and is
very well determined. The derivative dg/dr is almost
certainly positive in the outer core so the first term on the
right side of (A2) is negative. The second term is much
smaller than the first because (∂K_S/∂p)_{S,C} is relatively
insensitive to p but is negative if the material is normal.
Thus the second radial derivative of V_P is also negative
given these assumptions.

How  sensitive  are  these  constraints  to  the  assumption 
that  the  temperature  gradient  is  adiabatic?  It  is  extremely 
unlikely  that  the  outer  core  can  be  significantly  superadia- 
batic  as  the  resulting  convective  instability  would  relieve 
the  condition  through  convection  [Masters,  1979]  resulting 
in  an  adiabatic  interior  with  thin  (seismically  unobserv¬ 
able)  boundary  layers.  To  examine  the  effect  of  a  subadi- 
abatic  gradient,  we  write  the  temperature  gradient  in  the 
following  form: 


$$ \frac{dT}{dr} = \left(\frac{dT}{dr}\right)_{S,C} (1 - f) \qquad (A3) $$

where f is a function of radius. In the isothermal case
f = 1, and since we have ruled out superadiabatic gradients,
0 ≤ f ≤ 1. The adiabatic temperature gradient is
given by

$$ \left(\frac{dT}{dr}\right)_{S,C} = -\frac{g T \gamma}{V_P^2} = -\frac{g T \alpha}{C_p} \qquad (A4) $$

where γ is Grüneisen's ratio, α is the coefficient of thermal
expansion, and C_p is the specific heat. With these
relations and the assumption of homogeneity one may
show

The temperature dependence of V_P is unknown but may
be estimated using the parameter δ_S introduced by
Anderson [1967]. Like (∂K_S/∂p)_{S,C}, δ_S is relatively insensitive
to pressure and temperature. In a fluid, δ_S can be written

$$ \delta_S = 1 - \frac{2}{\alpha V_P}\left(\frac{\partial V_P}{\partial T}\right)_{p,C} $$

We conclude that δ_S ≥ 1 as (∂V_P/∂T)_{p,C} is negative in
almost all materials. In fact, experiments usually show
that δ_S ≈ (∂K_S/∂p)_{S,C}, so it is reasonable to suppose that
1 < δ_S < 4 in the outer core. Using (A1), (A5) can be
written

$$ \frac{dV_P}{dr} = \left(\frac{dV_P}{dr}\right)_{S,C} (1 + \alpha T \gamma f X) \qquad (A6) $$

where

$$ X = \frac{\delta_S - 1}{(\partial K_S/\partial p)_{S,C} - 1} $$

i.e., both f and X probably lie between zero and one.
Equation (A6) says that dV_P/dr will be slightly more negative
in a subadiabatic region. Reasonable estimates of
α, T, and γ [e.g., Stacey, 1977] suggest that αTγ is about
0.05 in the outer core, so the velocity gradient would be
changed by about 5% in an isothermal region.

Differentiating (A6) with respect to radius yields

$$ \frac{d^2 V_P}{dr^2} = \left(\frac{d^2 V_P}{dr^2}\right)_{S,C} (1 + \alpha T \gamma f X) + \left(\frac{dV_P}{dr}\right)_{S,C} \frac{d}{dr}(\alpha T \gamma f X) \qquad (A7) $$

The first term on the right of (A7) is negative, but the
second term could cancel it, giving a nonnegative value of
d²V_P/dr²; we can show that this is unlikely.

The only difficulty is the unknown radial variation of f.
The radial variation of αTγX is dominated by the
behavior of α, a rapidly decreasing function of pressure.
Therefore d(αTγX)/dr is positive. Equation (A7) shows
that if f is constant or df/dr is positive (i.e., the core is
increasingly stable at larger radius), the second radial
derivative of V_P remains negative. It may only become
nonnegative if df/dr is large and negative: d²V_P/dr² = 0
when

$$ \frac{1}{f}\frac{df}{dr} \approx -10^{-5}\ \mathrm{m}^{-1} $$

For  this  to  occur  the  core  must  go  from  an  adiabatic  state 
to  an  isothermal  state  in  a  radial  distance  of  about  100  km. 

In  summary,  subadiabatic  temperature  gradients  cause 
the  velocity  gradient  to  steepen  slightly  (becoming  more 
negative)  and  leave  the  second  radial  derivative  of  velocity 
negative  provided  the  core  is  uniformly  stable  or  becomes 
more  stable  toward  the  core-mantle  boundary.  The 
second  radial  derivative  could  become  nonnegative  if  the 
core  becomes  stable  within  a  very  small  range  of  radius  as 
one  approaches  the  inner  core  boundary,  but  we  consider 
this  case  unlikely. 
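
The effect of a subadiabatic gradient on dV_P/dr can be traced in a few lines. The sketch below is a reconstruction that agrees with the statements in this appendix (a gradient that steepens by roughly αTγ, about 5%, in the isothermal limit); it is not necessarily the authors' own intermediate form. It assumes a homogeneous fluid in which V_P depends on p and T, and it restates the relations it uses.

```latex
% Sketch: velocity gradient with a subadiabatic temperature gradient
\begin{align*}
\frac{dV_P}{dr}
  &= \left(\frac{\partial V_P}{\partial p}\right)_{T}\frac{dp}{dr}
   + \left(\frac{\partial V_P}{\partial T}\right)_{p}\frac{dT}{dr},
  \qquad \frac{dT}{dr} = \left(\frac{dT}{dr}\right)_{S,C}(1 - f) \\
  &= \left(\frac{dV_P}{dr}\right)_{S,C}
   - f \left(\frac{\partial V_P}{\partial T}\right)_{p}\left(\frac{dT}{dr}\right)_{S,C}. \\
\intertext{Substituting $\left(\partial V_P/\partial T\right)_p = -\tfrac{1}{2}\alpha V_P(\delta_S - 1)$
and $\left(dT/dr\right)_{S,C} = -gT\gamma/V_P^2$:}
\frac{dV_P}{dr}
  &= \left(\frac{dV_P}{dr}\right)_{S,C}
   - f\,\frac{\alpha g T \gamma}{2 V_P}(\delta_S - 1) \\
  &= -\frac{g}{2V_P}\left[\left(\frac{\partial K_S}{\partial p}\right)_{S,C} - 1\right]
     \left(1 + \alpha T \gamma f X\right),
  \qquad X = \frac{\delta_S - 1}{(\partial K_S/\partial p)_{S,C} - 1}.
\end{align*}
```

With f = 1 and X near one, the correction factor is 1 + αTγ, so αTγ ≈ 0.05 gives the quoted change of about 5%.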

What is the effect of variations in the composition? We
make the usual assumption that the outer core is
predominantly molten iron with small amounts of light impurities.
This model is based partly on shock wave experiments
which indicate that the outer core is about 10% less dense
than pure iron [e.g., Jeanloz, 1979]. The same experiments
show that the bulk sound speeds (which equal V_P
in a fluid) of the outer core and of iron at core pressures
are virtually indistinguishable. It seems that the effect of
impurities is to lower K_S as much as they lower the
density, resulting in little effect on the velocity. In this case,
reasonable radial variations in the composition of the
outer core would not affect the negativity of the first and
second radial derivatives of velocity. One can use the
analysis of Jeanloz [1979] to estimate (∂V_P/∂c)_{T,p}, where
c is the concentration of light impurities, and to make an
analogous argument to the one for thermal variations
given above. The results are similar, and we conclude that
the constraints on the radial derivatives are unlikely to be
violated for reasonable chemical or thermal departures
from the adiabatic and homogeneous state.

Appendix B: Equivalence of Two
Optimization Problems

It is not difficult to show that the extrema of z(v) occur
for the same models dz/dv as the extrema of r(v). We do
this by showing that the variations of r(v) and z(v) differ
by a multiplicative constant, so that the perturbations to
dz/dv one should make to improve the values of the
penalty functionals have the same direction. Since the
linear programming solution for r(v) is optimal, that is,
the functional derivative projected onto a given direction
either vanishes or leads outside the constraint set, r(v)
cannot be improved locally without leaving the region of
dz/dv that fit the data. We write r as a functional of
ζ ≡ dz/dv:


where 


Z,  [£ 1  =  J*  £  dv  with  >•  =  vr  (£  ]/a 
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Using  (Bl)  and  the  definition  of  v, 

vd-a/u)  .n-a/ui 

A  =  J  8  dv  +  J*  £  dv  +  J"  8  dv 

The third term is second order in δ, and the second term
may be approximated by −ζ(v)Δ/a for ζ sufficiently
smooth and Δ/a sufficiently small. Note that for the
finite-dimensional realization of the problem that we
solve, this requires that the perturbation be sufficiently
small not to cause the upper limit of integration to move
into the support of a different basis function, or that the
coefficients of the two match to give a continuous
transition. To first order,

A  ~  J  8  dv  -  £  ( v )  A/ a 

If we solve for Δ and substitute the result into equation
(B1), we find that

r  l£+8]  —/•[£]=  \  8  dv 

a  +  £  ( v )  J 

However, 

Z,  [£+6]  -  Z,  [£]  -  j  8  dv 
so  we  may  write 

r[£+6i—  r(£]  =  (Z,  fc+S]- Z,  {£]) 

a  +  £  ( v ) 

The negative sign is expected since maximizing the depth
minimizes the radius. This shows that any extremum of
Z_t over the set of ξ satisfying the constraints is also an
extremum of r. We must still show that r does not
achieve a "better" value for any other ξ satisfying the
constraints.

The functional r[ξ] is convex since it is the composition
of two convex mappings: a linear mapping (Z_t[ξ]) and an
exponential. The set of models ξ(v) that satisfy the data
and the radial derivative constraint, if it is used, is convex
because it is described by linear inequalities. The familiar
theorem concerning the extrema of convex functionals
over convex sets applies: the value achieved at a local
extremum of r[ξ] satisfying the data is the global extremal
value with respect to the set of feasible solutions. We are
therefore justified in solving the problem by finding the
extrema of z(v) and mapping the resulting values into
values of r(v).

We present a new technique for constructing the narrowest corridor containing all velocity
profiles consistent with a finite collection of τ(p) data and their statistical uncertainties. Earlier
methods for constructing such bounds treat the confidence interval for each τ datum as a strict
interval within which the true value might lie with equal probability, but this interpretation is
incompatible with the estimation procedure used on the original travel time observations. The new
approach, based upon quadratic programming (QP), shares the advantages of the linear
programming (LP) solution: it can invert τ(p) and X(p) data concurrently; it permits the incorporation of
constraints on the radial derivative of velocity for spherical earth models; and theoretical results
about convergence and optimality can be obtained for the method. We compare P velocity bounds
for the core obtained by QP and LP. The models produced by LP predict data values at the ends
of the confidence intervals; these values are unlikely according to the proper statistical distribution
of errors. For this reason the LP velocity bounds can be wider than those given by QP, which
takes better account of the statistics. Sometimes, however, the LP bounds are more restrictive
because LP never permits the predictions of the models to lie outside the confidence intervals even
though occasional excursions are expected. The QP bounds grow narrower at lower levels of
confidence, but the corridors at 95% and 99.9% are virtually indistinguishable. The data must be
improved substantially to make a significant change in the velocity bounds.


Introduction 

This paper is a sequel to Stark et al. [1986] (hereinafter
called SPMO); we assume the reader is familiar with their
notation and results. Both papers address the nonlinear
inverse problem of ray theoretic seismology on a
one-dimensional earth. The earliest approach to the problem,
that of Wiechert and Herglotz in the 1900s [Aki and
Richards, 1980], assumes that an exact travel time curve is
available and that the earth does not contain strong
low-velocity zones (regions where dv(r)/dr > v/r, where v is
seismic velocity as a function of r, radius). With these
assumptions, there is a unique velocity model
corresponding to the data. Once stronger low-velocity zones are
permitted, many models may satisfy the data [for example,
see Gerver and Markushevich, 1966]. Even without strong
low-velocity zones, usually infinitely many earth models
satisfy the available discrete imprecise travel time
observations, which do not constitute exact travel time curves.
The nonuniqueness introduced by the finite number of
data and their contamination by errors is traditionally
addressed by trying to delimit the range of models that fit
the data adequately.

For reasons stated by SPMO, it is convenient to work
with τ(p), the vertical delay time as a function of ray
parameter, rather than T(X), travel time as a function of
epicentral distance. Bessonova et al. [1976] introduced a
method of estimating sample means of τ and their
standard deviations at discrete values of p, assuming that the
noise contaminating the travel time observations is
random and uncorrelated and has zero mean. On these
assumptions the sample mean is approximately Student t
distributed when p changes very little over the bands of X
used in the estimation of τ. The assumption of
independent zero-mean random noise is probably not valid for a
variety of reasons: travel times are biased by near-source
and near-receiver anomalies and by large-scale
heterogeneities within the earth; picking errors are likely to be
systematic; and at some point the assumptions of ray
theory break down. All these factors tend to correlate
travel time measurement errors, some tend to bias the
measurements, and without more information one should
hesitate to assert that the errors are truly Gaussian.
However, the approximation is increasingly reasonable when
there are many observations with wide geographic
distribution and it enables one to make progress on the problem.

The nonlinear inversion scheme of Bessonova et al.
[1976] and the linear programming (LP) method of
Garmany et al. [1979], discussed at length by SPMO, take
confidence intervals derived from the means and standard
deviations as strict limits within which τ must lie. Dorman
and Jacobson [1981] objected that velocity bounds based
on this reinterpretation of the statistical data will be
erroneous. Some fraction of the time the confidence
intervals will not include the τ(p) values of the real earth;
the strict reinterpretation does not allow for this, and so
the resulting envelope of models may be too narrow. On
the other hand, the models that determine points on the
velocity-depth bounds tend to predict values of τ(p) that
lie at the ends of most of the confidence intervals. Since
there is really a probability distribution of values within
the confidence intervals and the values at the ends are less
likely to come from the earth, a model that predicts τ(p)
values consistently at the ends is extremely unlikely to
represent the earth and the envelope may well be too
wide.

We shall show that it is not necessary to reinterpret the
statistical estimates as strict limits on τ. We propose a
method, dubbed QP (for quadratic programming), that
finds velocity-depth bounds from estimates of the mean
values of τ(p) and X(p) at various p and their standard
deviations. QP retains the advantages of LP: interpolation
of the data is not necessary, multivalued velocity models
are automatically excluded from the inversion, X(p) data
can be used concurrently with τ(p), and the radial
derivative of velocity can be constrained. We show below that
the proof of the equivalence of the spherical and flat earth
inverse problems given by SPMO also applies to the
inverse problems from statistical data.

For convenience we shall work in the flat earth,
following SPMO as closely as possible in our notation. In the
application to the earth's core using the τ(p) data of
Johnson and Lee [1985], we use the basis functions (pieces of
1/v) and the radial derivative constraints proposed by
SPMO.

The  QP  Method 

The problems of calculating τ(p) and X(p) are
nonlinear with the customary representation of the earth by
v(z), velocity as a function of depth. However, Garmany
et al. [1979] noted that τ(p) and X(p) are linear
functionals of a one-dimensional flat earth model expressed as the
derivative of depth with respect to velocity, dz/dv, which
we call ξ = ξ(v):

$$T_i[\xi] \equiv \tau(p_i) = 2 \int_w^{1/p_i} \left( v^{-2} - p_i^2 \right)^{1/2} \xi(v) \, dv$$

$$X_i[\xi] \equiv X(p_i) = 2 \int_w^{1/p_i} p_i \left( v^{-2} - p_i^2 \right)^{-1/2} \xi(v) \, dv$$

where w is the surface velocity. The depth to a fixed
target velocity v_t is also a linear functional of the unknown
earth model ξ:

$$Z_t[\xi] \equiv z(v_t) = \int_w^{v_t} \xi(v) \, dv$$

Let y denote the reciprocal of the smallest p_i in the data
set. Then y is the largest velocity about which the data
give us any information, so we will take ξ(v) to be defined
on the interval [w, y]. We must insist that ξ(v) ≥ 0 to
ensure ξ corresponds to a single-valued velocity model
v(z). We may exclude flat earth profiles that correspond
to spherical earth profiles with low-velocity zones by
requiring ξ(v) ≤ a/v, where a is the radius of the
spherical earth. SPMO derive this expression for the constraint
and justify its use in inversions for core structure.
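The linearity of these functionals in ξ is easy to see numerically. The following is a minimal sketch, assuming a piecewise-constant ξ on a velocity grid (the paper itself uses pieces of 1/v as basis functions, so this discretization, and the names `edges`, `xi`, `tau`, `distance`, and `depth`, are illustrative assumptions). It uses the closed-form antiderivatives of the two kernels, s − artanh(s) and −s/p with s = (1 − p²v²)^(1/2), so no numerical quadrature near the turning velocity 1/p is needed.

```python
import math


def _s(v, p):
    """sqrt(1 - (p*v)**2), set to exactly 0 at or beyond the turning velocity 1/p."""
    if v >= 1.0 / p:
        return 0.0
    return math.sqrt(max(0.0, 1.0 - (p * v) ** 2))


def tau_antiderivative(v, p):
    """Antiderivative in v of (v**-2 - p**2)**0.5, for 0 < v <= 1/p."""
    s = _s(v, p)
    return s - math.atanh(s)


def x_antiderivative(v, p):
    """Antiderivative in v of p * (v**-2 - p**2)**-0.5, for 0 < v <= 1/p."""
    return -_s(v, p) / p


def _integrate(antiderivative, edges, xi, v_top):
    """Integrate piecewise-constant xi against a kernel from edges[0] up to v_top."""
    total = 0.0
    for lo, hi, xi_j in zip(edges[:-1], edges[1:], xi):
        lo, hi = min(lo, v_top), min(hi, v_top)  # clip each cell at the upper limit
        if hi > lo:
            total += xi_j * (antiderivative(hi) - antiderivative(lo))
    return total


def tau(edges, xi, p):
    """Delay time tau(p) for xi[j] = dz/dv on [edges[j], edges[j+1]]."""
    return 2.0 * _integrate(lambda v: tau_antiderivative(v, p), edges, xi, 1.0 / p)


def distance(edges, xi, p):
    """Epicentral distance X(p) for the same piecewise-constant model."""
    return 2.0 * _integrate(lambda v: x_antiderivative(v, p), edges, xi, 1.0 / p)


def depth(edges, xi, v_t):
    """Depth z(v_t) at which the target velocity v_t is reached."""
    return _integrate(lambda v: v, edges, xi, v_t)
```

For a constant ξ = c (velocity increasing linearly with depth) the sketch reduces to the familiar result X(p) = 2c(1 − p²w²)^(1/2)/p, and a finite-difference derivative of `tau` with respect to p returns −X(p), which gives two simple checks on the kernels.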

Our data are sample means d_i and their standard
deviations σ_i of τ(p_i) for i = 1, ..., n_τ and of X(p_i) for
i = n_τ + 1, ..., n. The measure of misfit to the data we
shall use is

$$\mu[\xi] = \sum_{i=1}^{n_\tau} \left( \frac{d_i - T_i[\xi]}{\sigma_i} \right)^2 + \sum_{i=n_\tau+1}^{n} \left( \frac{d_i - X_i[\xi]}{\sigma_i} \right)^2 .$$

We will say that a model ξ(v) fits the data adequately if
μ[ξ] ≤ M², where M is some chosen tolerance. We can
estimate the probability that the actual τ(p) and X(p)
predictions of the real earth fit the sample means within
M². Let us assume, following Bessonova et al. [1976], that
errors in the travel times are independent and normally
distributed and have zero mean; then the sample means
are approximately Student t distributed. We will assume
further that the estimates of d_i have a large number of
degrees of freedom and that we have more than a few
data (n >> 1). Then the weighted misfit of the real
earth's predictions to the sample means is approximately
χ²_n distributed (chi-square with n degrees of freedom).
With M² equal to the 1 − α percentage point of the χ²
distribution, the requirement that μ[ξ] ≤ M² limits our
search to a set of models whose predictions include those
of the real earth at the 1 − α confidence level. We shall see
later that the precise value of M² makes very little
difference in the bounds we find.
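As a rough illustration of how the tolerance M² follows from the confidence level, the sketch below evaluates the weighted misfit and approximates the χ² percentage point. The paper reads its values from the tables of Abramowitz and Stegun; the Wilson-Hilferty approximation used here is a substitute chosen only so the example needs nothing beyond the Python standard library, and the function names are hypothetical.

```python
from statistics import NormalDist


def weighted_misfit(d, predicted, sigma):
    """mu = sum_i ((d_i - predicted_i) / sigma_i)**2 for paired data and predictions."""
    return sum(((di - fi) / si) ** 2 for di, fi, si in zip(d, predicted, sigma))


def chi2_percentage_point(n, confidence):
    """Wilson-Hilferty approximation to the chi-square quantile with n degrees of freedom."""
    z = NormalDist().inv_cdf(confidence)  # standard normal quantile
    a = 2.0 / (9.0 * n)
    return n * (1.0 - a + z * a ** 0.5) ** 3


# Tolerances for 25 data at the two confidence levels used later in the paper.
m2_95 = chi2_percentage_point(25, 0.95)
m2_999 = chi2_percentage_point(25, 0.999)
```

With n = 25 the approximation lands within about 0.2 of the tolerances quoted later in the text for the 95% and 99.9% levels (37.7 and 52.6), which is adequate for illustration; exact tabulated values should be preferred in practice.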

The problem of finding velocity bounds is nonlinear for
a host of reasons: first, a nonlinear transformation from
the spherical to the flat earth has taken place; second, the
data mappings as usually written are nonlinear; third, the
measure of misfit to the data is quadratic; and finally,
linear inequality constraints are required to ensure that the
models are physically reasonable and that the radial
derivative constraint is not violated. In general, problems
of this kind are not soluble. Here, we show how a
finite-dimensional approximation to the problem can be solved
and give a numerically stable algorithm to solve it; P. B.
Stark (Rigorous velocity bounds from soft τ(p) and X(p)
data, submitted to Geophysical Journal of the Royal
Astronomical Society, 1986; hereinafter referred to as Stark
(1986)) has demonstrated rigorously that the
finite-dimensional approximation converges to the optimal
result.

Denote by U the set of flat earth models ξ = ξ(v) that
satisfy μ[ξ] ≤ M², the positivity constraint ξ(v) ≥ 0 that
ensures that the models are physically realizable and, if we
choose to impose it, the radial derivative constraint
ξ(v) ≤ a/v. We construct velocity bounds by finding the
maximum and minimum depths at which a target velocity
v_t can occur among the models in U. We repeat the
procedure with different target velocities until we have a good
description of the envelope of acceptable models. This is
the same approach used by LP and described by SPMO.
The misfit functional μ[ξ] is a positive semidefinite
quadratic form and thus μ[ξ] ≤ M² defines a convex set of
models ξ. (This does not mean that any particular model
ξ(v) is a convex function: the set of models satisfying the
constraint is a convex set in the space of models from
which ξ is drawn. See Luenberger [1969] about convex
sets of functions.) The positivity constraint ξ(v) ≥ 0 and
the radial derivative constraint ξ(v) ≤ a/v are both linear
inequality constraints; hence they too describe convex
sets. U, the intersection of these three sets (the set of
models that fit the data adequately, represent single-valued
velocity models and have a nonpositive radial derivative in
the spherical earth) is also a convex set. The depth to the
target velocity v_t, Z_t[ξ], is a linear functional of the
model ξ; linear functionals are convex. Our task is to find
the extrema of the convex functional Z_t[ξ] over the
convex set of models U. The familiar theorem that local
extrema achieve the global extremal values applies and we
conclude that the minimum and maximum depths are
unique.
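The convexity of the set of models that fit the data adequately can be verified in one step. As a sketch, write F_i for the data functional of the i-th datum (a label introduced here for T_i or X_i as appropriate), take two models ξ₁ and ξ₂ with misfit no larger than M², and let 0 ≤ λ ≤ 1. Because each F_i is linear and the square is a convex function,

```latex
\begin{aligned}
\mu[\lambda\xi_1 + (1-\lambda)\xi_2]
  &= \sum_{i=1}^{n} \left(
       \frac{\lambda\,(d_i - F_i[\xi_1]) + (1-\lambda)\,(d_i - F_i[\xi_2])}{\sigma_i}
     \right)^2 \\
  &\le \lambda\,\mu[\xi_1] + (1-\lambda)\,\mu[\xi_2] \;\le\; M^2 .
\end{aligned}
```

Every model on the segment joining ξ₁ and ξ₂ therefore also fits the data adequately, which is the defining property of a convex set.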

We now know enough to establish the equivalence of
the spherical and flat earth inverse problems using
statistical data. SPMO proved the equivalence for the strict data
problem; their proof relies upon the first-order
equivalence of the functional derivatives of radius and
depth with respect to ξ and upon the convexity of the set
of solutions that satisfy the strict data, the positivity
constraint, and the radial derivative constraint. The depth
functional Z_t[ξ] and the radius functional are the same
for the statistical data problem as for the strict data
problem because we are using the same representation of the
earth, ξ(v) = dz/dv. We have just seen that the set U of
models that satisfy the statistical data within M² and the
positivity and radial derivative constraints is convex, so
the proof given by SPMO applies to this problem as well.

TABLE 1. X(p) Data Used in Some of the Inversions, Abstracted
From the X(p) Observations of Johnson and Lee [1985]

p, s km⁻¹     X, km      σ₁, km     σ₂, km

0.02922       7273.4     380.216    268.8
0.03013       6985.0     380.216    219.5
0.03095       5953.9     380.216    143.7
0.03265       6661.8     380.216    219.5
0.03347       6843.3     380.216    190.1

See Figure 2. The values of ray parameter are given as the flat
earth values reduced to the surface of the core (3480 km). The
X(p) sample means, labeled X, are values reduced to the surface
of the core by subtracting the predictions of the PREM anisotropic
earth model [Dziewonski and Anderson, 1981]. The standard
deviations σ₁ are referred to in the text as the weaker X(p) constraints;
σ₂ are the tighter X(p) data. The values of σ₁ correspond to the
confidence intervals for X(p) used by Stark et al. [1986].

How may we find the minimum and maximum depths
to v_t? A slightly different perspective makes the job fairly
straightforward, although it is rather expensive
computationally. We shall look at the models that fit the data best
subject to the additional linear constraint that they reach a
certain depth at the target velocity v_t. The penalty
functional will be the misfit to the data, not the depth to the
target velocity. All the constraints are then linear and the
only nonlinearity is in the new penalty functional, which is
quadratic in the unknown model ξ.

Consider the earth model ξ*(v), where
0 ≤ ξ*(v) ≤ a/v, that has the smallest μ[ξ], the
weighted misfit to the data means d_i. If we approximate
the problem in finite dimensions by writing ξ as a linear
combination of a finite set of basis functions, then an
approximation to ξ* can be found by quadratic
programming with linear inequality constraints. Stark (1986)
proves that for any reasonable choice of basis functions,
the results obtained by increasing the number of basis
functions used in the finite-dimensional approximation
converge to the correct answer for the infinite-dimensional
problem. The inverse problem is consistent using that set
of basis functions provided μ[ξ*] ≤ M². The best fitting
finite-dimensional model ξ* associates the depth
z* = Z_t[ξ*] with the target velocity v_t. Provided the
problem is consistent, z* is an upper bound on the least depth
to v_t and a lower bound on the greatest depth to v_t.

We prove in Appendix A that if we add the linear
constraint that the model reach a greater depth than z* at v_t,
the best fitting model that we can then find will have a
larger misfit μ[ξ]. If we make the model attain a still
greater depth, the misfit will continue to grow. The same
thing happens if we require the model to have smaller and
smaller depths than z*. Since the constraint that the
model arrive at a certain depth at v_t is linear (Z_t[ξ] is a
linear functional), finding the best fitting model that
satisfies the positivity and radial derivative constraints and
that achieves a certain depth at v_t is another quadratic
programming problem with linear inequality constraints.
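Written out in finite dimensions, the two quadratic programs take the following form. This is a sketch under stated assumptions: the expansion ξ(v) = Σ_j α_j b_j(v), the matrix G_ij (the i-th data functional applied to the basis function b_j), and the vector c_j = Z_t[b_j] are labels introduced here for illustration, not notation taken from the paper.

```latex
\begin{aligned}
\min_{\alpha} \quad
  & \mu(\alpha) = \sum_{i=1}^{n}
      \left( \frac{d_i - \sum_{j} G_{ij}\,\alpha_j}{\sigma_i} \right)^2 \\
\text{subject to} \quad
  & 0 \le \sum_{j} \alpha_j\, b_j(v) \le a/v
      \quad \text{for } w \le v \le y , \\
  & \sum_{j} c_j\, \alpha_j = z
      \quad \text{(imposed only in the depth-constrained problems).}
\end{aligned}
```

Without the last line the minimizer approximates ξ* and gives z*; with it, the minimum value is the smallest misfit attainable at depth z. If the basis functions are taken to be b_j(v) = 1/v on disjoint velocity intervals (an assumption about their normalization), the inequality constraints reduce to the simple bounds 0 ≤ α_j ≤ a.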

The statistical data problem thus can be solved with a
single-parameter search: starting with z*, the depth to v_t
achieved by the overall best fitting model, we add the
constraint that the model attain a slightly larger depth and
find the best fitting model; we continue increasing the
depth until the best fitting model subject to the constraint
has a misfit larger than M². The depth at which M² is first
exceeded is the maximum depth to v_t in that
discretization. Similarly, by decreasing the depth until M² is
passed, we may find the least depth to v_t in the
discretization. The monotonicity of the misfit with changes in the
depth constraint lets us stop the search as soon as M² is
overrun: the misfit will not fall again.
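The search logic can be sketched independently of the quadratic programming solver. In the example below, `min_misfit(z)` stands for "solve the depth-constrained QP and return its minimum misfit"; here it is replaced by a hypothetical one-line quadratic stand-in (`toy_min_misfit`, whose 5 km scale is invented for illustration), and the refinement by bisection is an addition of this sketch rather than a step described in the paper. Both the marching and the bisection rely on the monotonicity of the misfit away from z*.

```python
def bound_depth(min_misfit, z_star, m2, step, tol=1e-6, max_steps=10_000):
    """March from z_star in the direction of `step` until the misfit exceeds m2,
    then bisect to locate the last depth at which the misfit is still <= m2."""
    if min_misfit(z_star) > m2:
        raise ValueError("inconsistent problem: best fitting model exceeds the tolerance")
    inside, outside = z_star, z_star + step
    for _ in range(max_steps):
        if min_misfit(outside) > m2:
            break  # tolerance overrun; monotonicity says it will not fall again
        inside, outside = outside, outside + step
    else:
        raise RuntimeError("tolerance was never exceeded within max_steps")
    while abs(outside - inside) > tol:
        mid = 0.5 * (inside + outside)
        if min_misfit(mid) > m2:
            outside = mid
        else:
            inside = mid
    return inside


def toy_min_misfit(z):
    """Hypothetical stand-in for the QP solve: a quasi-convex misfit curve in depth."""
    return 0.86 + ((z - 2049.0) / 5.0) ** 2


deepest = bound_depth(toy_min_misfit, 2049.0, 52.6, step=+1.0)
shallowest = bound_depth(toy_min_misfit, 2049.0, 52.6, step=-1.0)
```

In a real inversion each call to `min_misfit` is a full QP solve, so the step size and tolerance trade accuracy of the bound against the number of solves.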

Application to the Earth's Core

We implemented QP as described in Appendix B on the
National Science Foundation San Diego Supercomputer
Center Cray X-MP/48. The computational requirements
of the algorithm are fairly heavy; each set of bounds
presented required about 30 minutes of central processor
time. The inversions that follow use the means and
standard deviations of the 25 uninterpolated τ(p) data
tabulated by Johnson and Lee [1985]. All our inversions
employ the radial derivative constraints and 1/v basis
functions advocated by SPMO. SPMO used an extra pair
of basis functions bracketing the target velocity to enhance
the numerical efficiency of LP (Stark (1986) provides a
theoretical explanation of this effect); we have followed
their practice. We used 100 basis functions in the
preliminary expansion and started the inversions at a radius of
3480 km with a minimum surface velocity w = 7 km s⁻¹,
as they did. The smallest number of degrees of freedom
in the τ(p) estimates is 115, and many estimates were
derived from thousands of observations, so approximating
the distributions of the sample means by Gaussians is
reasonable. The approximation is improved further by the
summation over the 25 data. The five X(p) means and
uncertainties that we use to refine the structure near the
inner core boundary in some of the inversions that follow
are tabulated in Table 1. We somewhat arbitrarily ascribed
two sets of uncertainties to the X(p) observations of
Johnson and Lee [1985] because the data, from an array study
in central Arizona, are relatively few in number and may
not be very representative of the spherically averaged
earth.

When we refer to statistical bounds at the 99.9% or 95%
confidence level, we mean that we have set M² equal to
the appropriate percentage point of the χ² distribution.
Equating n with the number of data is appropriate because
we are not estimating a model, nor indeed reducing the
number of degrees of freedom at all; we are estimating a
bound on a property of the set of models satisfying the
data and additional constraints. We used values of χ²
from Abramowitz and Stegun [1965]; they range from about
37 to about 60 for 25 and 30 degrees of freedom. A
glance at Figure 1, a representative plot of the minimum
misfit to the data as a function of the depth that the model
is made to attain, shows that the misfit passes rapidly
through that span with only a slight change in depth, so we
do not expect the bounds at those confidence levels to be
very different. As we proved, the smallest misfit is
monotonic in the depth that the model is constrained to reach.
The best fitting model ξ*(v) has a misfit of about 0.86 and
reaches a depth of 2049 km at the flat earth velocity 18
km s⁻¹; it is, however, built unattractively from steps in
the spherical earth velocity. It is usually true that models
with the smallest misfit to the data are disenchantingly
rough. A common alternative to finding bounds on the
set of models that satisfy the data, as we do here, is to
seek the smoothest model that fits the observations
adequately (see Constable et al. [1987], for example). Either
approach discourages us from attaching too much
significance to accidental properties of a particular model.

Fig. 1. Representative plot of misfit versus depth the model is
constrained to achieve, from the inversion of the 25 τ(p) data with
target velocity v_t = 18 km s⁻¹. The minimum misfit of 0.8611
occurs at 2049 km.

Figure 2 compares QP and LP bounds based on the 25
data for τ(p) alone. The dotted bounds are those
obtained by SPMO using LP with radial derivative
constraints. The solid lines are QP bounds at the 99.9%
confidence level (M² = 52.6) and, as we might have
predicted from the steep misfit functional, they are
indistinguishable from the QP bounds at the 95% confidence
level (M² = 37.7) in a diagram of this small size. The
similarity of these bounds supports our use of the
approximation that the individual data errors sum to a χ²
distribution since the results are not sensitive to the precise
tolerance M² we choose. In places the QP bounds lie
outside the LP bounds because QP allows the misfit to occur
in the most advantageous place while LP limits the misfit
at each p independently. In other places the LP bounds
are outside; this is because the models produced by LP
tend to have data predictions along the ends of the
confidence intervals. QP will not permit this since such
predictions are jointly extremely improbable. As an
interesting note, geometrical constructions of the kind
SPMO used with the strict LP bounds cannot be applied
to the statistical bounds: there is no reason to expect the
bounds themselves to meet the same physical constraints
as the models in the statistical problem. In the strict
problem one knows that bounds using the radial derivative
constraint themselves satisfy the radial derivative
constraint; here that is clearly false. The slight narrowing
within the inner core where the bounds violate the radial
derivative constraint is due to the presence of a datum
with particularly small standard deviation.

We have used the five looser X(p) data of Table 1 to
try to reduce the width of the envelope near the inner core
boundary. Figure 3 compares the results at the 99.9%
confidence level (M² = 59.7, solid line) with the LP
solution (dashes) using the strict bounds on X(p) given by
SPMO. The QP results using the five weaker X(p)
constraints are essentially identical to the results in Figure 2
using only the 25 τ(p) data. The five weaker X(p)
constraints make a significant difference to the strict LP
bounds near the inner core, but they are too loose to
affect the QP results. We assigned more optimistic
estimates of the standard deviations of the X(p) observations
(Table 1) to improve the bounds and inverted again.
Figure 4 plots the results at the 95% confidence level
(M² = 43.8, dashes). The solid lines represent both the
results from the 25 τ(p) data alone and using the weak
X(p) constraints. The tighter X(p) data narrow the
bounds particularly near the inner core boundary but also
generally throughout the core. We tried to invert the
corresponding strict data bounds with LP, but the revised
data were inconsistent even using 200 basis functions;
doubtless this is why they have a strong effect on the QP
inversion. (This illustrates, however, that QP is less
sensitive than LP to the estimation of data errors.) The
smaller X(p) error estimates are probably too optimistic
especially with reference to spherically averaged earth
structure; we therefore prefer the velocity bounds based
on the 25 τ(p) and five weaker X(p) data (solid lines).

Fig. 2. Bounds based on the 25 τ(p) data. The solid lines are the statistical bounds at the 95%
confidence level; the bounds at the 99.9% confidence level are indistinguishable at this scale. The
dotted lines are the strict bounds obtained by SPMO from the 99.9% confidence intervals of
Johnson and Lee [1985]. The two statistical bounds are so similar because χ² changes so abruptly with
depth (Figure 1). Also note that while the models are constrained to be monotonic, the statistical
bounds need not be monotonic although the strict bounds must.

Fig. 3. Statistical bounds at the 99.9% confidence level using the 25 τ(p) data alone and using in
addition the five loose X(p) estimates from SPMO using Johnson and Lee's [1985] data. On this
scale the results are not distinguishable; both are represented by the solid lines. The dashed
bounds are the strict bounds from SPMO using both the τ(p) and X(p) data. The errors assigned
to the estimates of X(p) are too large to change the statistical bounds, although they have a
significant effect on the strict bounds near the inner core boundary.

The dotted line in the middle of the bounds is the
PREM model of Dziewonski and Anderson [1981].
Although PREM lies inside the corridor, its weighted
misfit to the sample means of the 25 τ(p) data is
immense. This demonstrates that an arbitrary model
within the corridor will not necessarily fit the data. Every
velocity-depth point on or within the bounds is consistent
with the data: each is contained in some model that fits
the data. However, many models in the corridor are
invalid. Within the finite-dimensional approximation, the
data rule out every point outside the corridor; as the
approximation improves, the bounds move slightly
outward. We tried unsuccessfully to bring the predictions of
PREM into agreement with a χ² measure of misfit to the
data at the 99.9% level with a baseline shift: there is still
some inconsistency between short- and long-period
seismic data.


Fig. 4. Statistical bounds at the 95% confidence level. The solid lines are the pair of bounds found
using the τ(p) data alone and using the five weaker X(p) constraints, as in Figure 2. The dashed
bounds use instead the five tighter X(p) estimates in Table 1. Note the profound effect near the
inner core boundary and the slight general narrowing. In contrast, changes to the X(p) data affect
the strict bounds only near the inner core boundary. The PREM earth model [Dziewonski and
Anderson, 1981] appears as the dotted line for reference. PREM does not satisfy the τ(p) data,
even at the 99.9% confidence level.

Discussion

The QP method allows statistical estimates of τ(p) to be
used in a manner that is more consistent with their
derivation than the previous methods of Bessonova et al. [1976]
and Garmany et al. [1979]. On the basis of 25 core τ(p)
data from Johnson and Lee [1985], QP inversions at the
99.9 and 95% confidence levels find a corridor of velocity
models about 0.25 km s⁻¹ wide and limit the location of
the inner core boundary to approximately 1227-1290 km.
The velocity jump at the inner core boundary is about
0.4-0.8 km s⁻¹. QP is less sensitive than LP to the
estimation of data errors. In places the QP bounds are wider
than the corresponding linear programming bounds; in
places they are tighter. Overall there is not much
difference. This does not indicate that LP and QP will
always give comparable results: the nonlinearity of the
problem prevents one from predicting what would happen
with different data.

While QP brings us closer than any previous method to
a completely consistent use of the scattered travel time
observations that are available, several steps remain. The
sample means of $\tau(p)$ are found by averaging over small
bands of $X$ [Bessonova et al., 1976]; through the unknown
$X(p)$ function this corresponds to a weighted average in
$p$. The averaging should be incorporated into the inversion
process: we ought to require that models predict
acceptable average values of $\tau(p)$ over appropriate bands
of $p$. The present method of estimating $\tau(p)$ will not
allow this; it seems likely that another method could be
devised whose averaging in $p$ is more easily quantified.
The current estimation procedure also requires that the
errors in the travel time observations have zero mean and
be independent and normally distributed. It might be possible
with a great deal more data to estimate the true error
distribution of our measurements, but this probably could
not be done independently of a reference model, which
begs the question.

Appendix A: Quasi-Convexity of $\mu$

Here we prove that as the depth to $v_i$ is constrained to
be further and further from $z^*$, the depth achieved by the
model $\zeta^*$ that minimizes the misfit $\mu[\zeta]$, the misfit
increases monotonically. This is equivalent to showing
that the function of $z$ defined by finding the minimum of
$\mu[\zeta]$ among the models $\zeta$ satisfying the positivity and
radial derivative constraints and the constraint that
$Z_i[\zeta] = z$ is a quasi-convex functional of $z$ (see Bazaraa
and Shetty [1979] about quasi-convexity). Let $P$ denote
the convex set of models that satisfy the positivity and
radial derivative constraints. Define $S^z$ to be the subset of
models in $P$ that reach the depth $z$ at velocity $v_i$, that is,
the elements of $P$ that satisfy the additional constraint
that $Z_i[\zeta] = z$. $S^z$ is obviously convex. By definition
$\zeta^* \in S^{z^*}$. Let $\zeta^1$ be a model that minimizes the convex
functional $\mu[\zeta]$ over the convex set $S^{z^1}$. We will derive a
contradiction from the assumption that there is a model
$\zeta^2 \in S^{z^2}$ such that $\mu[\zeta^2] < \mu[\zeta^1]$, where $z^1$ is between $z^*$
and $z^2$, i.e., $z^* < z^1 < z^2$ or $z^* > z^1 > z^2$. All convex
linear combinations of $\zeta^*$ and $\zeta^2$ satisfy the convex
positivity and radial derivative constraints since $\zeta^*$ and $\zeta^2$ both
do. In particular, $\zeta' = \alpha\zeta^2 + (1-\alpha)\zeta^* \in S^{z^1}$ for
$\alpha = (z^1 - z^*)/(z^2 - z^*) \in [0,1]$ because then
$z^1 = \alpha z^2 + (1-\alpha) z^*$ and $Z_i[\zeta]$ is linear in $\zeta$. The misfit functional
$\mu[\zeta]$ is convex, so by definition

$$
\begin{aligned}
\mu[\zeta'] &= \mu[\alpha\zeta^2 + (1-\alpha)\zeta^*] \\
&\le \alpha\mu[\zeta^2] + (1-\alpha)\mu[\zeta^*] \\
&< \alpha\mu[\zeta^1] + (1-\alpha)\mu[\zeta^*] \\
&\le \alpha\mu[\zeta^1] + (1-\alpha)\mu[\zeta^1] = \mu[\zeta^1].
\end{aligned}
$$

But $\zeta^1$ was defined to be the model in $S^{z^1}$ that minimized $\mu[\zeta]$,
so we have reached a contradiction. Figure 1 is a
representative plot of $\mu$ as a function of $z$.


Appendix  B:  Numerical  Implementation  of  QP 

Our FORTRAN implementation of QP is based on the
algorithm NNLS [Lawson and Hanson, 1974]. Nonnegative
least squares (NNLS) solves the problem

$$\min_{x \ge 0} \| A x - b \|$$

where $x \in \mathbb{R}^l$, $b \in \mathbb{R}^m$, and $A$ is a matrix of $m$ rows and $l$
columns. NNLS is a tremendously robust program even
when a large number of variables are used, so much so
that SPMO used a weighting scheme similar to the one we
will describe to code LP by simulating linear programming
with NNLS.
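
As a small illustration of the NNLS problem stated above, the following Python sketch solves a toy system with three rows and two unknowns. It is not the Lawson and Hanson active-set algorithm used in the FORTRAN code; a plain projected gradient iteration stands in for it, and the function name `nnls_projected_gradient`, the matrix, and the data vector are all assumptions made for the example.

```python
def nnls_projected_gradient(A, b, iters=5000):
    """Minimize ||A x - b||_2 subject to x >= 0.

    Stand-in for NNLS: projected gradient descent, not the
    Lawson-Hanson active-set method.
    """
    m, n = len(A), len(A[0])
    # Squared Frobenius norm bounds the largest eigenvalue of A^T A,
    # so a step of 1/lip is safe.
    lip = sum(a * a for row in A for a in row)
    x = [0.0] * n
    for _ in range(iters):
        # Residual r = A x - b and gradient g = A^T r.
        r = [sum(A[i][j] * x[j] for j in range(n)) - b[i] for i in range(m)]
        g = [sum(A[i][j] * r[i] for i in range(m)) for j in range(n)]
        # Gradient step followed by projection onto x >= 0.
        x = [max(0.0, x[j] - g[j] / lip) for j in range(n)]
    return x


# Toy system (hypothetical numbers): the unconstrained least squares
# solution is (1, -1), which violates the sign constraint.
A = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0]]
b = [1.0, -1.0, 0.0]

x = nnls_projected_gradient(A, b)
print(x)
```

In this toy case the second unknown is pinned at zero and the first is refit on its own, which is the behavior the nonnegativity constraint is meant to produce.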

To find the finite-dimensional approximation to the best
fitting model $\zeta^*$, we pass NNLS the following matrix $A$:

$$A = \begin{pmatrix} I & I \\ \gamma W \begin{pmatrix} \tau \\ X \end{pmatrix} & 0 \end{pmatrix}$$

and the vector $b$:

$$b = \begin{pmatrix} c \\ \gamma W d \end{pmatrix}$$

$I$ is the $L$ by $L$ identity matrix, where $L$ is the number of
basis functions. The second $I$ matrix is used to introduce
a set of $L$ positive slack variables that impose upper
bounds on the coefficients in the basis expansion (see
Bazaraa and Shetty [1979] for a discussion of slack
variables). The diagonal matrix of weights, $W$, accounts for
the different standard deviations of the data:
$W_{ij} = 1/\sigma_i$ if $i = j$, and 0 otherwise. The matrices $\tau$ and $X$
map the coefficients of the basis expansion for $\zeta$ into their
$\tau(p)$ and $X(p)$ predictions (for the exact expressions
using the 1/c basis functions, see the definitions of the $\tau$ and
$X$ matrix elements in the fourth section of SPMO). We bound the
coefficients of the model to enforce the radial derivative
inequality constraint with the vector $c = (a, a, \ldots, a)^T \in \mathbb{R}^L$,
where $a$ is the radius of the body of interest (here
3480 km, the core radius). The small positive constant $\gamma$
downweights fitting the data versus satisfying the radial
derivative constraints. (The radial derivative inequalities
are then satisfied almost exactly.) The sample means of
the data comprise the vector $d$. The first $L$ elements of
the unknown vector $x$ are the coefficients of the basis
expansion we want; the latter $L$ are the slack variables
mentioned above, which are needed to impose the radial
derivative inequality constraints. NNLS automatically
forces the unknown to be nonnegative. This in turn
ensures that the positivity constraint is satisfied through our
choice of basis functions. In general, the radial derivative
constraints will be
violated slightly since NNLS minimizes the two norm of
the  misfit  to  b;  however,  with  y  =  Iff  1  they  were  never 
violated  by  more  than  a  part  in  10"  in  our  applications. 
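
The role of the slack variables and of the downweighting can be seen in a toy problem. The sketch below assembles a matrix and right-hand side in an assumed block form, identity blocks on top and downweighted data rows beneath, for two coefficients and two data. The prediction matrix `G` is a hypothetical stand-in for the stacked $\tau$ and $X$ matrices, the standard deviations are set to one, and the upper bound is 1 rather than the core radius. A projected gradient iteration again stands in for NNLS; every name and number here is an assumption for illustration.

```python
def nnls_projected_gradient(A, b, iters=20000):
    """Minimize ||A x - b||_2 subject to x >= 0 (stand-in for NNLS)."""
    m, n = len(A), len(A[0])
    lip = sum(a * a for row in A for a in row)  # bounds eigenvalues of A^T A
    x = [0.0] * n
    for _ in range(iters):
        r = [sum(A[i][j] * x[j] for j in range(n)) - b[i] for i in range(m)]
        g = [sum(A[i][j] * r[i] for i in range(m)) for j in range(n)]
        x = [max(0.0, x[j] - g[j] / lip) for j in range(n)]
    return x


def bounded_fit(G, d, sigma, c, gamma):
    """Fit data d with coefficients in [0, c] using slack variables.

    Assumed block form: top rows (I  I) with right-hand side c,
    bottom rows gamma * W * G with right-hand side gamma * W * d.
    """
    L = len(c)
    A, b = [], []
    for k in range(L):
        # coefficient_k + slack_k = c_k; slack_k >= 0 gives coefficient_k <= c_k.
        row = [0.0] * (2 * L)
        row[k] = 1.0
        row[L + k] = 1.0
        A.append(row)
        b.append(c[k])
    for i, g_row in enumerate(G):
        # Data rows scaled by 1/sigma_i and downweighted by gamma.
        w = gamma / sigma[i]
        A.append([w * g for g in g_row] + [0.0] * L)
        b.append(w * d[i])
    x = nnls_projected_gradient(A, b)
    return x[:L], x[L:]


G = [[1.0, 0.0], [0.0, 1.0]]   # hypothetical prediction matrix
d = [3.0, 0.5]                 # first datum asks for a value above the bound
sigma = [1.0, 1.0]
c = [1.0, 1.0]

coef_loose, slack_loose = bounded_fit(G, d, sigma, c, gamma=0.3)
coef_tight, slack_tight = bounded_fit(G, d, sigma, c, gamma=0.1)
print(coef_loose[0] - c[0], coef_tight[0] - c[0])
```

In this toy case the first coefficient exceeds its bound by $2\gamma^2/(1+\gamma^2)$, so the violation shrinks roughly as $\gamma^2$, while the second coefficient, whose datum lies inside the bound, is fit exactly with a positive slack.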
Once we have found the best fitting model, we may constrain
the model to attain the depth $z$ at velocity $v_i$ by
adding to $A$ a row that is the finite-dimensional representation
of $Z_i[\cdot]$ (see the expression for the elements of $Z_i$ in the same
section of SPMO) and adding $z$ as a corresponding element
of $b$. For numerical stability it is important that this new
row be inserted above the rows downweighted by $\gamma$
[Lawson and Hanson, 1974]. A priori all we know about the
misfit as a function of the constraint depth is that it is
monotonic on either side of the best fitting depth. We used a
bisection method to find $z$ such that the minimum of $\mu[\zeta]$ over
$S^z$ is equal to the prescribed misfit level, because it is guaranteed to converge.
More sophisticated search algorithms would probably not
increase the efficiency much since the misfit is so flat near
the best fitting depth (Figure 1). The iteration involved in
the bisection and the large number of points needed for
an accurate description of the bounds make the computational
requirements of the method fairly heavy compared
with LP.
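
The bisection search can be sketched as follows, under stated assumptions. The function `toy_min_misfit` is a hypothetical closed-form stand-in for the constrained minimum misfit as a function of depth; in the actual method each evaluation would require a full NNLS solution with the extra depth-constraint row. The best fitting depth, the far points, and the target misfit level are invented numbers chosen so that the answer is known.

```python
def bisect_bound(min_misfit, z_best, z_far, target, tol=1e-9):
    """Find z between z_best and z_far where min_misfit(z) equals target.

    Assumes min_misfit is monotonic between z_best (acceptable misfit)
    and z_far (unacceptable misfit), as quasi-convexity guarantees.
    z_far may lie on either side of z_best.
    """
    ok, bad = z_best, z_far
    if not (min_misfit(ok) <= target <= min_misfit(bad)):
        raise ValueError("target misfit is not bracketed")
    while abs(bad - ok) > tol:
        mid = 0.5 * (ok + bad)
        if min_misfit(mid) <= target:
            ok = mid    # still acceptable: move the acceptable end outward
        else:
            bad = mid   # misfit too large: pull the unacceptable end inward
    return 0.5 * (ok + bad)


def toy_min_misfit(z):
    """Hypothetical misfit curve, flat near the best fitting depth z = 5."""
    return 1.0 + ((z - 5.0) / 2.0) ** 4


target = 2.0  # hypothetical misfit level for the chosen confidence
lower = bisect_bound(toy_min_misfit, z_best=5.0, z_far=0.0, target=target)
upper = bisect_bound(toy_min_misfit, z_best=5.0, z_far=10.0, target=target)
print(lower, upper)
```

Each bound costs about $\log_2$ of the bracket width over the tolerance in misfit evaluations, which is why repeating the search at many velocities makes the method expensive when every evaluation is itself a constrained least squares solution.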